AD-A116 767

WOODS HOLE OCEANOGRAPHIC INSTITUTION MA
COMPUTER STORAGE AND RETRIEVAL OF POSITION-DEPENDENT DATA (U)
JUN 82 R C GROMAN
N00014-79-C-0071

UNCLASSIFIED WHOI-82-27 F/G 9/2 NL


UNCLASSIFIED  6/82 


SECURITY CLASSIFICATION OF THIS PAGE (When Data Entered)


REPORT  DOCUMENTATION  PAGE 


1.  REPORT  NUMBER 

WHOI-82-27


4.  TITLE  (end  Subtitle) 

COMPUTER STORAGE AND RETRIEVAL OF
POSITION-DEPENDENT DATA


READ  INSTRUCTIONS 
BEFORE  COMPLETING  FORM 


3. RECIPIENT'S CATALOG NUMBER


5. TYPE OF REPORT & PERIOD COVERED


Technical


6. PERFORMING ORG. REPORT NUMBER


7. AUTHOR(s)

Robert  Carl  Groman 


9. PERFORMING ORGANIZATION NAME AND ADDRESS

Woods  Hole  Oceanographic  Institution 
Woods  Hole,  Massachusetts  02543 


11. CONTROLLING OFFICE NAME AND ADDRESS

NORDA/National  Space  Technology  Laboratory 
Bay  St.  Louis,  MS  39529 


14. MONITORING AGENCY NAME & ADDRESS (if different from Controlling Office)


8. CONTRACT OR GRANT NUMBER(s)

N00014-79-C-0071;
N00014-82-C-0019


10. PROGRAM ELEMENT, PROJECT, TASK
AREA & WORK UNIT NUMBERS

NR  083-004 


12.  REPORT  DATE 

June  1982 


13. NUMBER OF PAGES

139 


15. SECURITY CLASS. (of this report)


Unclassified 


16. DISTRIBUTION STATEMENT (of this Report)

Approved for public release; distribution unlimited.


18. SUPPLEMENTARY NOTES


This  report  should  be  cited  as:  Woods  Hole  Oceanog.  Inst.  Tech.  Rept. 
WHOI-82-27. 


19. KEY WORDS (Continue on reverse side if necessary and identify by block number)

1.  Data  retrieval 

2.  Location-dependent  data 

3.  Geographical  retrieval 


20. ABSTRACT (Continue on reverse side if necessary and identify by block number)


See  reverse  side. 


DD FORM 1473, 1 JAN 73. EDITION OF 1 NOV 65 IS OBSOLETE




WHOI-82-27


COMPUTER STORAGE AND RETRIEVAL OF
POSITION-DEPENDENT DATA


by 


Robert  Carl  Groman 


WOODS  HOLE  OCEANOGRAPHIC  INSTITUTION 
Woods  Hole,  Massachusetts  02543 


June  1982 


TECHNICAL  REPORT 


Prepared for the Office of Naval Research under Contracts
N00014-79-C-0071; NR 083-004 and N00014-82-C-0019; NR 083-004
and JOI, Inc.

Reproduction in whole or in part is permitted for any purpose
of the United States Government. This report should be cited
as: Woods Hole Oceanog. Inst. Tech. Rept. WHOI-82-27.

Approved  for  public  release;  distribution  unlimited. 

Approved  for  Distribution:  _ 

Earl E. Hays, Chairman
Department of Ocean Engineering



PREFACE 


This  thesis  covers  the  design  of  a  new  digital  database  system  to 
replace  the  merged  (observation  and  geographic  location)  record,  one  file 
per  cruise  leg,  magnetic  tape  based  system  which  served  the  Geology  and 
Geophysics  Department  from  1974  to  1981.  This  system  (described  in  WHOI 
Technical  Report  74-68  "The  Digital  Data  Library  System:  Library  Storage 
and  Retrieval  of  Digital  Geophysical  Data"  by  Robert  C.  Groman)  provided 
a  relatively  simple  procedure  for  areal  searching  of  our  bathymetric, 
geomagnetic  and  gravity  observations.  It  grew  out  of  the  resources 
available to the staff in the late 1960s: 16-bit minicomputers with
paper and magnetic tape drives for shipboard acquisition, and an
Institution central computer with limited disk storage capacity. It
became increasingly cumbersome and expensive to retrieve data from small
geographic areas because of the tape handling required and the sequential
data access.

Realizing that convenient and rapid access to the information,
together with ease of updating, was essential to the continuing use and
cost justification of the department database, and that its sheer size
(greater than two million points) made its use via the older system
tedious, Bob Groman started in 1978 to assess the options for a new one.
We were also interested in integrating other data types, including our
seismic  profiling  and  seafloor  samples.  He  began  with  a  series  of 
interviews  with  the  staff  and  students;  surveyed  systems  used  at  other 
laboratories  and  in  related  fields;  reviewed  published  computer 
databases;  and  looked  at  the  big  changes  in  computer  technology, 
especially  in  disk  drives.  Out  of  this  review  emerged  the  design  for  the 
database  treated  in  this  thesis,  which  has  now  been  implemented  on  one  of 
our D.E.C. VAX-11/780 computers. None of the published databases had the
capabilities required, and so it was necessary to design one permitting
insertion by cruise leg and retrieval by either geographic bounds or
cruise  leg.  Appendix  VI  gives  the  current  detailed  design  specifications. 

A  major  portion  of  the  costs  of  the  new  database  came  from  our  Ocean 
Industry  Program,  and  along  with  our  other  supporting  agencies,  we 
provide  the  member  companies  dial-up  access  to  the  data.  Some  support 
came  from  Office  of  Naval  Research  and  the  Ocean  Margin  Drilling 
contracts. Partial support of Groman's studies at Worcester Polytechnic
Institute  came  from  the  WHOI  Education  Office. 

We are pleased to have this new facility in place, and would like by
this report to share it with others. I hope that readers will feel free
to  contact  Bob  Groman  for  further  information. 


Department  of  Geology  and  Geophysics 


ABSTRACT 


A  data  storage  and  retrieval  scheme  has  been  designed 
and  implemented  which  provides  cost  effective  and  easy  access 
to location-dependent geophysical data. The system is
operational on a Digital Equipment Corporation VAX-11/780
computer. Values of measured and computed geophysical
parameters, such as geomagnetic field, water depth and
gravity field, are stored in the library system. In addition,
information about the data, such as port stops, project
name and funding agency, is also saved. These data
are  available  to  a  time  sharing  computer  user,  validated 
to  use  the  software  package,  through  a  query  language 
designed  to  interact  with  this  data  library.  The  data  can 
be searched and retrieved both sequentially and geographically.


FOREWORD

A  computer  storage  and  retrieval  scheme  for 
location-dependent  data  has  been  implemented  and  is  described 
here. Computer routines were written to store hundreds of
thousands  of  oceanographic  values  in  direct  access  disk  files. 
Data  structures  were  designed  to  access  these  data  efficiently 
by  geographical  area  and  by  acquisition  platform  name. 

Chapter I provides an introduction to the problem and gives
a brief description of the solution. Chapter II gives a short
description of existing geographic management systems. I also
review the objectives and features of data base management
systems. Chapter III describes the user requirements and the
user characteristics for a geophysical library system in more
detail. The unique characteristics of geophysical data are
further defined. Other design criteria such as hardware
constraints and funding support are also covered. Chapter IV
describes the implementation features of the library system
including a description of the storage scheme and key
algorithms. Chapter V provides a summary of the design
specifications and a comparison between this new scheme and
existing systems.


I  would  like  to  acknowledge  the  help  of  my  advisor  for  many 
useful  conversations  during  all  phases  of  this  work.  Donna 
Allison,  who  was  able  to  read  my  writing,  deserves  credit  for 
the  care  she  took  in  typing  this  thesis.  My  special  thanks  go 
to  my  wife,  Susan,  for  her  patience,  support  and  understanding 
that  were  needed  to  keep  both  of  us  going. 

The  Woods  Hole  Oceanographic  Institution's  Ocean  Industry 
Program,  the  Education  Office  and  the  Office  of  Naval  Research 
provided support for some of this work.


Chapter I

INTRODUCTION 

Data  management  of  oceanographic  data  shares  many  common 
features  with  data  management  in  other  scientific  and  business 
disciplines.  However,  there  are  some  unique  problems  and 
features  of  oceanographic  data  which  require  special 
attention.  For  example,  the  data,  collected  by  oceangoing 
ships,  are  position  and  time  dependent.  Because  of  the 
uncertainties  in  the  ship's  position  and  measuring  instruments, 
it  may  be  necessary  to  save  different  values  for  the  same 
physical phenomenon. Also, the number of measurements that
need to be stored and later retrieved is in the millions.

Geophysical measurements, including 'navigation', 'depths',
'geomagnetics' and 'gravity', are used by geologists,
geophysicists  and  other  physical  scientists  to  understand  the 
basic  physical  processes  occurring  on  and  below  the  earth's 
surface.  In  order  to  take  advantage  of  the  increasing  amount 
of  geophysical  measurements  collected  by  oceanographic 
institutions,  government  agencies  and  private  research  groups, 
an efficient and flexible approach to data storage and
retrieval is needed (Hartman, 1978). The following is a
description  of  a  system  I  designed  and  implemented  to  satisfy 
these  storage  and  retrieval  requirements. 

Problem  Description 

The  primary  attribute  of  geophysical  measurements  is  their 
position,  given  as  a  pair  of  latitude  and  longitude  values. 
These  two  numbers  taken  together  can  be  viewed  as  the  unique 
key  for  the  information  about  a  position  on  the  earth’s 
surface,  much  in  the  same  way  as  a  person’s  social  security 
number  fully  identifies  that  individual.  All  geophysical  data 
must  be  viewed  as  having  this  location  dependency.  The 
information  cannot  be  acquired,  stored,  retrieved  or  analyzed 
without  considering  at  what  geographical  location  the  data  were 
collected.  This  two  dimensional  aspect  to  geophysical  data 
requires  that  an  ordered  pair  of  numbers,  the  latitude  and 
longitude,  be  associated  with  each  data  value  if  the  values  are 
to  constitute  useful  measurements.  It  is  reasonable  to  ask 
whether  a  general  purpose  data  base  management  system  (Date, 
1976)  might  not  be  suitable  for  implementing  a  computer  based 
data  storage  and  retrieval  system  for  these  location-dependent 
data.  However,  the  nature  of  the  location  dependency,  the 
nature  of  the  data  parameters,  and  the  way  these  data  are 
accessed  and  used  make  such  a  solution  impractical. 


Geophysical  data  are  commonly  collected  by  the 
oceanographic research community using surface ships or
airplanes. Because the acquisition platform is usually moving
during data collection, these data are often called 'underway
data'. Measurements are collected directly in digital or
analog form. These measurements include the 'corrected depth',
'total geomagnetic field', 'total gravity field', 'current
velocity', and many more.¹ Information is collected
continuously with a repetition rate that depends on the rate of
change of the particular attribute. For example, the 'total
geomagnetic field' is typically recorded aboard ship every five
minutes  over  deep  water  (greater  than  1000  meters)  but  every 
one  minute  over  shallower  water  since  the  measured  values  vary 
faster  in  shallower  water.  Other  data  parameters  have 
acquisition  rates  ranging  from  every  ten  seconds  (e.g.  ship's 
velocity)  to  every  few  hours  (e.g.  bird  sightings). 

Additional information is derived by further data
processing or by combining two or more of the measurements. For
example, the 'free air anomaly' is obtained by subtracting the
'total gravity field' from a theoretically computed 'regional
gravity field'. Also, new measurements may emerge in the
future. For example, one new procedure is to add a second
magnetometer (an instrument used to measure the 'geomagnetic
field') towed a fixed distance from the first magnetometer.
This arrangement allows a 'gradient geomagnetic value' to be
computed and is used to further understand the subbottom sea
structure.

¹ Appendix I provides an explanation of many of the
oceanographic terms used here.

As  is  evident  from  the  above,  a  data  storage  scheme  for 
geophysical  data  must  be  able  to  manage  a  large  number  of 
different  parameters  at  each  geographical  position.  In 
addition,  because  of  the  variability  in  acquisition  rates  and 
the  needs  of  the  individual  researchers,  the  data  storage 
scheme must allow for missing data parameters.

The  data  storage  scheme  is  further  complicated  by  the 
retrieval  requirements  that  researchers  place  on  a  geophysical 
digital data library. Investigators often make requests about
the existence of geophysical data as well as requests to
retrieve these data within certain geographical boundaries.
Since data are collected in chronological order, a request for
chronological retrieval is also routine.

Data retrieval by chronological order is used during the
analysis of ocean bottom features. Indeed, the original cruise
track was probably chosen to simplify this type of post-cruise
analysis. A data structure using sequential data storage
easily satisfies this type of request. However, area studies
based on a synthesis of data from many different cruises
require that data be accessible by geographic area. A data
structure  which  allows  direct  access  to  the  stored  values  is 
needed  to  avoid  lengthy,  sequential  searches  for  relevant 
information.  An  additional  requirement  is  that  data  retrieval 
cannot  be  expensive  since  certain  query  requests  can  return 
many  thousands  of  data  values. 

Overview  of  Solution 

The  system  which  I  implemented  and  which  is  described  here 
is  designed  to  satisfy  the  needs  of  the  oceanographic  research 
community  for  a  digital  storage  and  retrieval  system  for 
location-dependent  data.  The  key  features  include  the 
following: 

1.  A  data  dictionary  describes  each  element  stored  in  the 
library.  This  approach  allows  for  easy  growth  when  new  data 
attributes  are  added. 

2.  A  data  storage  scheme,  using  magnetic  disk  storage  and 
isolating  the  positional  information  from  the  secondary 
measurements,  provides  flexible  and  efficient  data  storage. 
Infrequently  used  data  can  be  removed  from  on-line  storage 
without  jeopardizing  the  ability  to  satisfy  requests  about  the 
data. 

3.  The  data  organization  facilitates  retrieval  within 
geographical  boundaries  and  yet  retains  a  sequential  access 
capability.

4. Data compaction techniques reduce disk storage needs
and  reduce  retrieval  costs. 

5.  An  inquiry  facility  provides  the  user  with  a 
comprehensive  set  of  commands  to  query  the  library  about  the 
information  stored  as  well  as  provide  interfaces  to  existing 
graphics  routines. 
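
The first two features can be illustrated with a minimal sketch. It is not the VAX implementation described in this report; the class names, the in-memory dictionaries standing in for disk files, and the units shown are all assumptions made for illustration. It shows a data dictionary describing each stored element, positions kept apart from the secondary measurements, and a parameter that is simply absent where it was not measured.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class DictionaryEntry:
    """Describes one kind of element stored in the library."""
    name: str
    units: str          # illustrative units, not taken from the report
    description: str


class Library:
    """Hypothetical in-memory stand-in for the disk-based library."""

    def __init__(self) -> None:
        self.dictionary: Dict[str, DictionaryEntry] = {}
        # Positional information: one (latitude, longitude) pair per position.
        self.positions: List[Tuple[float, float]] = []
        # Secondary measurements, kept apart from the positions and keyed by
        # parameter name, then by index into self.positions.
        self.measurements: Dict[str, Dict[int, float]] = {}

    def define(self, name: str, units: str, description: str) -> None:
        """Add a new data attribute; data already stored are untouched."""
        self.dictionary[name] = DictionaryEntry(name, units, description)
        self.measurements.setdefault(name, {})

    def store(self, latitude: float, longitude: float, **values: float) -> int:
        """Store one position plus whichever parameters were measured there."""
        unknown = set(values) - set(self.dictionary)
        if unknown:
            raise KeyError(f"not in data dictionary: {sorted(unknown)}")
        index = len(self.positions)
        self.positions.append((latitude, longitude))
        for name, value in values.items():
            self.measurements[name][index] = value
        return index

    def value(self, name: str, index: int) -> Optional[float]:
        """Return the value at a position, or None if it was not measured."""
        return self.measurements[name].get(index)


library = Library()
library.define("corrected_depth", "meters", "water depth")
library.define("total_geomagnetic_field", "nanoteslas", "geomagnetic field")
first = library.store(39.5, -70.25, corrected_depth=2750.0)
# A new attribute is added later without restructuring what is stored.
library.define("gradient_geomagnetic_value", "nanoteslas per meter",
               "gradient from two towed magnetometers")
```

Because each parameter has its own table, dropping an infrequently used parameter from on-line storage would leave the positions, and therefore questions about where data exist, intact.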

A crucial consideration in designing a data retrieval
system is how the data will be used. In my case, there may be
requests for profiles and charts, and summaries of data
availability. One can also expect that new applications will
introduce new location-dependent parameters. It is the
position, the location on the earth's surface at which the
parameter was measured, which plays a central role in these
data sets. This position-dependency leads to two types of
relations: a near neighbors relation and a sequential neighbors
relation.

The near neighbors relation refers to the fact that the
study of oceanography very often analyzes the relationship
among data values within geographically close locations.
Depending on the features under investigation, the geographical
area can span from a few hundred square kilometers ('fracture zone
studies') to as much as an entire ocean (Mid-Atlantic Ridge
studies). Data obtained by different researchers, over the
course of many years, are often studied together in order to
piece together an understanding of the basic physical processes
taking place on, in, and below the ocean.

The  sequential  neighbors  relation  occurs  because  data 
parameters are collected, usually aboard a sea-going ship, in
sequence, as the ship moves through the water. Often, the
change in the values from one position to the next is as
informative as the absolute value of the collected data. This
leads  to  a  requirement  that  data  be  accessible  in  the  same 
order  as  they  were  collected. 

These  two  relationships  form  the  basis  of  nearly  all 
existing  underway  geophysical  retrieval  requests  at  the  Woods 
Hole  Oceanographic  Institution.  It  is  not  surprising  then, 
that  I  have  designed  the  data  storage  scheme  to  facilitate 
retrieval  along  these  lines.  However,  I  have  not  lost  sight  of 
the  fact  that  requirements  evolve  and  new  data  types  are  added, 
and  I  have  included  features  in  my  design  to  allow  for  change. 
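
The two relations can be expressed as two simple queries over the same set of points. The sketch below is illustrative only: the `Point` fields, the cruise leg names and the sample values are hypothetical, and the bounding-box test assumes the area does not cross the 180 degree meridian.

```python
from typing import List, NamedTuple


class Point(NamedTuple):
    cruise_leg: str
    sequence: int       # order of collection within the cruise leg
    latitude: float
    longitude: float
    depth: float


def near_neighbors(points: List[Point], south: float, north: float,
                   west: float, east: float) -> List[Point]:
    """Points inside a latitude/longitude box, regardless of cruise leg."""
    return [p for p in points
            if south <= p.latitude <= north and west <= p.longitude <= east]


def sequential_neighbors(points: List[Point], cruise_leg: str) -> List[Point]:
    """Points from one cruise leg, in the order they were collected."""
    return sorted((p for p in points if p.cruise_leg == cruise_leg),
                  key=lambda p: p.sequence)


def successive_changes(track: List[Point]) -> List[float]:
    """Change in depth from each position to the next along a track."""
    return [b.depth - a.depth for a, b in zip(track, track[1:])]


points = [
    Point("LEG-A", 2, 40.0, -70.0, 2600.0),
    Point("LEG-A", 1, 39.5, -70.5, 2750.0),
    Point("LEG-B", 1, 39.6, -70.4, 2700.0),
    Point("LEG-A", 3, 41.0, -69.0, 2400.0),
]
box = near_neighbors(points, 39.0, 40.0, -71.0, -70.0)
track = sequential_neighbors(points, "LEG-A")
changes = successive_changes(track)
```

Both queries here scan every point, which is exactly the cost a storage scheme organized around these two relations is meant to avoid.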


Chapter  II 


GENERAL  SOLUTION  TECHNIQUES 

There  are  always  advantages  and  disadvantages  in 
structuring  data  in  any  given  way.  Since  different  data  have 
different  characteristics,  this  should  affect  the  data 
organization  (Date,  1976  and  Martin,  1975).  Furthermore, 
specific  applications  for  data  will  often  require  unique  access 
methods.  For  example,  in  the  preparation  of  a  contour  plot, 
the  contouring  task  will  be  simplified  if  the  data  are 
presorted  into  an  evenly  spaced  grid.  However,  if  data  from 
many  different  sources  are  geographically  sorted,  retrieval 
from  a  single  source  is  not  enhanced.  Because  of  this 
variability  in  data  characteristics  and  data  applications,  it 
is  not  surprising  that  a  number  of  data  management  schemes  have 
been  produced.  In  this  chapter,  four  geophysical  data 
management  schemes  are  described  in  order  to  provide  additional 
insight  into  the  problem  of  storing  and  retrieving 
location-dependent data. I begin by considering the features
of geographical data processing schemes. I then conclude with
a discussion of the four geophysical data management systems.


Geographical  Data  Processing 

The broad area of geographical data processing and
management is of interest in the management of city growth, the
acquisition and use of census studies, and studies of regional
land use, as well as in the oceanographic applications
considered here. The storage schemes used by geographical data
processing  systems  are  classified  by  Nagy  and  Wagle  into  two 
broad  categories:  data  banks  and  data  bases  (Nagy  and  Wagle, 
1979).  In  a  data  bank,  data  are  separated  according  to  some 
natural  intellectual  understanding  of  the  data  or  according  to 
a  subextent  (i.e.,  geographical)  basis  or  both.  In  general, 
data  banks  are  simpler  to  implement  than  data  bases  since  they 
are  specifically  designed  to  input,  output  and  process  data 
along  the  lines  of  the  data  separation.  Data  typically  are 
stored  sequentially  although  provision  for  selective  retrieval 
usually is provided. In a data base implementation, data are
stored  along  with  inter-entity  relational  information.  These 
relations  enhance  the  utility  of  data  and  offer  other  benefits. 

Many geographical data processing systems divide the data
into convenient, manageable geographical areas. The
Experimental Cartographic System (ECS), developed in the latter
part of the 1960's, uses 15 by 15 minute areas (Nagy and Wagle,
1979).  The  Storage  and  Access  of  Network  Data  on  Rivers  and 
Drainage  Basins  system  (STANDARD)  divides  data  into  either  15 
or 7 1/2 minute areas for processing (Wagle, 1978). More
flexibility is provided by the Geodata Analysis and Display
System (GADS), which used a relational data base management
system (Nagy and Wagle, 1979). In GADS the basic geographical
unit’s  size  and  shape  can  be  adjusted  to  each  application. 
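
Dividing data into fixed-size areas amounts to mapping each position to a cell number. The following sketch shows one way this could be done for 15 minute or 7 1/2 minute cells. It is not taken from ECS or STANDARD; the function name and the choice of numbering rows northward from 90 S and columns eastward from 180 W are assumptions.

```python
import math
from typing import Tuple


def cell_index(latitude: float, longitude: float,
               cell_minutes: float = 15.0) -> Tuple[int, int]:
    """Return the (row, column) of the cell that contains a position.

    Latitude and longitude are in decimal degrees, north and east positive.
    Rows count northward from 90 S and columns eastward from 180 W; this
    origin is an assumption made for the example.
    """
    cells_per_degree = 60.0 / cell_minutes
    row = math.floor((latitude + 90.0) * cells_per_degree)
    column = math.floor((longitude + 180.0) * cells_per_degree)
    return row, column


coarse = cell_index(41.52, -70.67)        # 15 minute cells
fine = cell_index(41.52, -70.67, 7.5)     # 7 1/2 minute cells
```

All positions that fall in the same cell share an index, so a retrieval by area only needs to visit the cells that intersect the requested bounds.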

These  and  other  geographical  data  processing  systems  do 
share  some  common  attributes  with  business  data  processing 
systems  (Nagy  and  Wagle,  1979).  Common  attributes  include  the 
presence of many complex data interrelationships, the need for
reliable and stable production settings, and a tendency toward
interactive systems. However, the two dimensional quality of
the data destined for a geographical data processing system
sets it apart from other forms of information. This quality
deeply influences the input and output operations, algorithm
design and especially the data organization.

Geophysical  Data  Management  Systems 

This  section  describes  four  existing  oceanographic 
geophysical  data  management  schemes.  Figure  1  provides  a 
summary comparison of these four schemes. The systems share
two common attributes: they make use of some form of data
location table to speed access to stored data, and they rely on
sequential access to the stored data.

The Lamont-Doherty Geological Observatory (LDGO) stores
'navigation', 'corrected depth', 'geomagnetics' and 'gravity'
measurements  on  removable  magnetic  disk  packs  (Weissel,  1979). 
Data records are organized by cruise leg, a typical approach.
A cruise leg constitutes one continuous acquisition period, by
a  single  ship,  usually  from  one  port  stop  to  the  next.  Data 
collected  during  a  cruise  leg  are  stored  as  a  single  sequential 
file.  A  data  location  table,  stored  on  disk,  acts  as  an  index 
to  the  various  cruises  and  contains  information  about  the 
cruise  such  as  project  name,  start  and  end  dates,  and  the  name 
of  the  funding  agency.  The  location  table  contains  the 
geographical  boundaries  which  encompass  each  cruise  leg.  A 
retrieval  by  area  is  handled  by  searching  the  location  table  to 
determine  which  cruise  leg  boundaries  overlap  the  requested 
retrieval  area.  By  using  the  location  table,  a  preprocessor 
program  is  able  to  generate  an  input  parameter  file  for  various 
plotting  programs.  This  procedure  has  two  major  benefits: 

1)  it  hides  the  organization  and  complexity  of  the  data 
storage  system;  and 

2)  it  simplifies  running  of  plotting  routines  which, 
because  of  their  flexibility,  usually  require  complex 
and  numerous  run-time  parameters. 
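
The location table search described above can be sketched as a bounding-box overlap test. This is an illustration, not LDGO's software: the `LegEntry` fields, the cruise leg names and the boundary values are hypothetical, and the test assumes no box crosses the 180 degree meridian.

```python
from typing import List, NamedTuple


class LegEntry(NamedTuple):
    """One row of a hypothetical location table."""
    cruise_leg: str
    south: float
    north: float
    west: float
    east: float


def overlaps(entry: LegEntry, south: float, north: float,
             west: float, east: float) -> bool:
    """True if the cruise leg's bounding box intersects the requested area."""
    return (entry.south <= north and entry.north >= south
            and entry.west <= east and entry.east >= west)


def legs_for_area(table: List[LegEntry], south: float, north: float,
                  west: float, east: float) -> List[str]:
    """Names of cruise legs whose boundaries overlap the retrieval area."""
    return [e.cruise_leg for e in table
            if overlaps(e, south, north, west, east)]


table = [
    LegEntry("LEG-A", 35.0, 42.0, -72.0, -60.0),
    LegEntry("LEG-B", 10.0, 20.0, -50.0, -40.0),
    LegEntry("LEG-C", 39.0, 41.0, -71.0, -69.0),
]
selected = legs_for_area(table, 39.5, 40.5, -70.5, -69.5)
```

An overlap only shows that a cruise leg might contain points in the area; the sequential file for each selected leg still has to be read to find them.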


Figure 1


Comparison  of  Existing  Oceanographic  Geophysical 
Data Management Schemes

The storage and retrieval scheme used by the Scripps
Institution of Oceanography (SIO) also uses the cruise leg as
the  basic  organizational  unit  (Smith,  1979).  However,  unlike 
LDGO's  scheme,  each  cruise  leg  record  is  allowed  to  take  its 
own  form  within  the  cruise  leg  file.  That  is,  while  each 
logical  record  in  a  cruise  leg  will  have  the  same  format,  this 
format  need  not  be  the  same  format  as  used  in  another  cruise 
leg.  There  exists  a  file  definition  scheme  which  specifies  the 
length and contents of the records of each data file.
Obviously, this scheme benefits from being able to accept new
data  attributes.  The  SIO  scheme  further  differs  from  the  LDGO 
scheme  in  that  actual  data  records  reside  on  magnetic  tape 
rather  than  disk.  However,  like  LDGO,  access  to  the  cruise  leg 
files  is  facilitated  by  a  location  table  maintained  on  disk. 
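
The idea of a file definition that gives each cruise leg its own record layout can be sketched with Python's standard `struct` module. The field names, format codes and cruise leg names below are hypothetical and are not SIO's actual formats; the point is only that the definition, not the program, determines the record length and contents.

```python
import struct
from typing import Dict, Tuple

# Hypothetical file definitions: for each cruise leg, an ordered list of
# (field name, struct format code). "d" is an 8-byte float, "f" a 4-byte float.
FILE_DEFINITIONS: Dict[str, Tuple[Tuple[str, str], ...]] = {
    "LEG-A": (("latitude", "d"), ("longitude", "d"),
              ("corrected_depth", "f")),
    "LEG-B": (("latitude", "d"), ("longitude", "d"),
              ("corrected_depth", "f"), ("total_geomagnetic_field", "f")),
}


def record_struct(cruise_leg: str) -> struct.Struct:
    """Build the fixed-length record layout for one cruise leg file."""
    codes = "".join(code for _, code in FILE_DEFINITIONS[cruise_leg])
    return struct.Struct("<" + codes)   # little-endian, no padding


def pack_record(cruise_leg: str, record: Dict[str, float]) -> bytes:
    """Encode one logical record using that cruise leg's definition."""
    names = [name for name, _ in FILE_DEFINITIONS[cruise_leg]]
    return record_struct(cruise_leg).pack(*(record[n] for n in names))


def unpack_record(cruise_leg: str, raw: bytes) -> Dict[str, float]:
    """Decode one logical record back into named values."""
    names = [name for name, _ in FILE_DEFINITIONS[cruise_leg]]
    return dict(zip(names, record_struct(cruise_leg).unpack(raw)))


sample = {"latitude": 39.5, "longitude": -70.25,
          "corrected_depth": 2750.0, "total_geomagnetic_field": 52310.5}
raw = pack_record("LEG-B", sample)
decoded = unpack_record("LEG-B", raw)
```

Accepting a new data attribute then means writing a new definition for later cruise legs, while files written under older definitions remain readable.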

The  geophysical  library  at  the  Woods  Hole  Oceanographic 
Institution  (WHOI)  uses  a  tape-based  data  library  approach 
(Groman, 1974). A disk file, similar to LDGO's location file,
stores information about each cruise leg in addition to
information about where data are stored. The retrieval scheme
searches the location table to see which cruise legs fall into
the area of interest and provides the tape names and file
numbers for these cruises.

The national repository for marine geology and geophysics
data,  the  National  Geophysical  and  Solar-Terrestrial  Data 
Center  (NGSDC),  collects  and  manages  depth,  geomagnetic, 
gravity  and  other  data  provided  by  oceanographic,  educational, 
commercial,  and  other  government  organizations  (Bender,  1979). 
The  storage  scheme  used  at  NGSDC  stores  data  on  magnetic  tape 
but  maintains  a  disk-based  abbreviated  summary  of  cruise 
tracks.  With  this  data  organization  they  can  provide 
researchers  with  information  about  where  and  how  much  data 
exists.  However,  actual  data  are  not  stored  on-line  and  must 
be  retrieved  sequentially  from  magnetic  tape. 

In  all  of  these  schemes,  the  cruise  leg  is  the  basic  data 
storage  unit.  While  this  philosophy  allows  easy  and 
inexpensive  storage  and  retrieval  by  cruise  leg,  it  does  little 
to  facilitate  retrieval  by  geographical  boundaries.  In  each 
case,  once  a  particular  cruise  leg  is  identified  as  containing 
information within the specified area, every record in the
cruise leg file must be accessed. That is, the basic unit of
data retrieval is the whole cruise leg. This can become very
inefficient, especially when only a small percentage of points
within the 'cruise leg' are actually used.

Underway geophysical measurements possess many unique
features.  These  include: 

1.  the  two  dimensional  nature  of  the  data  -  each  data 
point  must  be  associated  with  a  geophysical  position; 

2.  large  amounts  of  data  -  a  cruise  leg  will  collect 
digital  data  at  5,000  to  35,000  distinct  geographical  positions; 

3.  non-uniform  collection  methods  -  acquisition  rates  may 
differ  for  each  parameter  measured;  and, 

4.  retrieval  requests  require  that  data  be  accessible 
sequentially  by  cruise  leg  and  geographically  by  geographical 
bounds. 

I  have  been  unable  to  locate  any  data  management  system 
that  successfully  handles  location-dependent  data  collected 
over  a  very  large  area.  The  schemes  described  above  attempt  to 
resolve  the  somewhat  incompatible  requirements  of  accessing 
data  both  sequentially  by  cruise  legs  and  randomly  by 
geographic location.


Chapter III

MAJOR  LIBRARY  DESIGN  CONSIDERATIONS 

Many  factors  were  considered  during  the  design  of  this 
storage  and  retrieval  facility.  This  section  describes  six  of 
the  more  important  areas  including:  user  attributes,  data 
attributes,  adding  new  data,  modifications  and  deletions, 
computer  hardware,  and  retrieval  costs.  Appendix  II  outlines 
the  general  library  design  specifications. 

User  Attributes 

In  order  to  help  determine  the  retrieval  capabilities  that 
a  new  geophysical  data  management  scheme  should  possess,  I  took 
a  survey  of  potential  users  at  the  Woods  Hole  Oceanographic 
Institution.  These  users  were  asked  to  complete  a 
questionnaire  and,  in  some  cases,  users  were  interviewed  by  me 
in  person  or  by  phone. 

The  results  of  this  survey  indicate  that  easy  access  to 
stored  geophysical  data  is  the  primary  goal  although  easy 
access  does  not  necessarily  mean  immediate  access  to  all  data. 
The  system  should  handle  queries  about  the  existence  of  data  as 
well  as  support  the  various  graphical  display  options  currently 
available  (charts  and  profiles).  No  one  could  answer  questions 
about their query frequency with precision. Answers ranged
from once per year to once per week. However, I expect that
once  a  retrieval  scheme  is  implemented  which  provides 
researchers  with  enhanced  retrieval  and  graphical  tools,  the 
query  frequency  will  increase. 

The  discussions  with  potential  users  suggest  that  any  new 
scheme  should  allow  new  location-dependent  data  to  be  added  to 
the  library  without  extensive  changes.  This  is  evidenced  by 
the interest in managing other data types, such as 'core' and
'dredge' data, 'bottom photographs', 'bottom samples' and
'sedimentary  analysis'  data.  These  data  are  now  managed  using 
a  combination  of  computer  and  non-computer  techniques  and  it  is 
reasonable  to  expect  that  once  these  collections  outgrow  their 
present  management  systems  they  might  benefit  from  data 
management  software  similar  to  that  described  here. 

What  Data  Should  Be  Stored 

Part  of  the  preliminary  system  design  process  which  I 
conducted  included  a  study  of  the  attributes  which  should  be 
stored in a library of underway geophysical measurements.
Figure 2 summarizes the geophysical data types and quantities
currently  stored  by  WHOI's  existing  tape-based  system.  This 
storage  scheme  requires  162  megabytes  of  tape  storage,  although 
there  does  exist  considerable  data  redundancy.  I  estimate  that 
80  Mbytes  of  storage  would  be  needed  if  redundancies  were 


eliminated . 

Figure 2

Data Quantities at WHOI


Appendix II summarizes the data attributes which are
currently considered to be of interest to geophysicists (NOAA,
1977). Since this list is a superset of the data now available
from the WHOI tape library, it is clear that any new scheme
would require at least 80 Mbytes for data storage.

Adding  New  Data 

It  is  difficult  to  specify  how  often  newly  acquired 
geophysical  data  will  be  added  to  the  library.  When 
geophysicists  make  underway  geophysical  measurements  at  sea, 
new  data  are  generated  at  an  average  rate  of  1225 
"measurements"  per  day  or  10,000  sets  of  position-dependent 
records per 'cruise leg'. The number of geophysical-type
cruise  legs  per  year  at  WHOI  varies  between  0  and  20. 

A distinctive feature of geophysical measurements is that
several differing values may exist for the same measurement.
This can happen because of the errors inherent in the
navigational and measuring instruments. The library must be
able to save all data values, even if this results in
conflicting measurements.


Modifications  and  Deletions 

The  rate  of  data  record  modification  (including  deletions) 
is  equally  difficult  to  determine.  Based  on  the  existing 
tape-based library there would be less than a 1% modification
rate to a 'cruise leg' and a 0% deletion rate. Once data are
collected and added to the library, they are never completely
removed.

Computer  Hardware 

Any  computer  system,  as  long  as  it  supports  large  disk 
storage  with  a  random  access  capability  and  a  high  level 
programming  language,  could  be  a  target  machine  for  the  library 
system described here. The system selected was a Digital
Equipment Corporation VAX-11/780 computer because it was
available and provided disk storage.

Retrieval  Costs 

Part  of  my  design  time  was  spent  studying  plot  requests  for 
geophysical  data.  Figure  3  shows  a  summary  of  three  such 
requests.  Any  new  scheme  should  be  able  to  satisfy  this  type 
of request and, it is hoped, at a reduced cost. My new scheme
achieves this reduced cost by increasing the hit ratio.
Although storage and access costs are higher on a per record
basis for disk than for tape, reducing the number of accesses
needed to fulfill a request yields a net savings.


Figure  3 


Case  Study:  Data 

*  Program SEARCH is first run to determine which cruise legs
are likely to lie in the area specified. The hit ratio is
defined as the number of useful records decoded (that is,
data points which actually fall within the requested area)
divided by the total number of records read, deblocked and
decoded, times 100.

2  This hit ratio is computed as the number of useful records
decoded divided by the total number of logical records in
the library, times 100. This figure would apply if no
mechanism were used to preselect cruise legs.

**  Total  cost  is  the  cost  to  retrieve  data  and  create  a  plot 
tape  used  for  off-line  plotting.  These  figures  are  based 
on  1979  computer  rates  at  Woods  Hole  Oceanographic 
Institution  for  the  Sigma  7. 
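
The two hit ratios defined in the notes to Figure 3 can be sketched as follows. This is only an illustration: the function name and the record counts are hypothetical and are not taken from the case study.

```python
def hit_ratio(useful_records: int, records_examined: int) -> float:
    """Percentage of examined records that fall inside the requested area."""
    if records_examined <= 0:
        raise ValueError("records_examined must be positive")
    return 100.0 * useful_records / records_examined


# Hypothetical counts, chosen only to show the effect of preselection.
useful = 1_200               # data points inside the requested area
read_after_search = 15_000   # records read once cruise legs are preselected
library_total = 1_500_000    # all logical records in the library

with_preselection = hit_ratio(useful, read_after_search)
without_preselection = hit_ratio(useful, library_total)
```

The same number of useful records gives a much higher ratio when fewer records have to be read, which is the effect the retrieval scheme relies on.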


Data  which  are  accessed  infrequently  can  migrate  from  on-line 
disk  storage  to  lower  cost  tape  storage.  Based  on  experience 
with Woods Hole Oceanographic Institution's existing tape-based
library, I anticipate that less than 50 percent of primary data
and 70 percent of secondary data need be on-line at any one
time, although some information about each 'cruise leg' must
always be available on-line. I estimate that 300 megabytes are
needed  for  on-line  disk  storage.  This  figure  is  based  on  an 
analysis  of  the  disk  storage  requirements.  Figure  4  summarizes 
this  analysis. 

Disk  storage  costs,  as  well  as  updating  and  maintenance 
costs,  can  be  charged  to  the  users  of  the  system  with  the 
record  keeping  and  charge-back  mechanism  which  I  have  built 
into  the  retrieval  software. 

Figure 4


Analysis  of  Disk  Storage  Requirements 


Assumptions : 

1.  151 cruise legs are to be stored.

2.  Provide growth for 300 cruise legs.

3.  Average  cruise  leg  is  28  days  long,  collecting  15,008 
position  records. 

4.  Average  ship  speed  is  6  knots. 

5.  Four  detail  groups  will  be  stored:  geophysical  detail, 
bathymetry  detail,  geomagnetic  detail  and  gravity 
detail  data. 

6.  Data  are  collected  in  every  10  degree  square. 

7.  Infrequently  used  data  can  migrate  to  tape  storage. 


Data Structure       Disk Space Required (bytes)

Ship Table                   412,672
Bounds Tables*            31,186,944
Primary Data               1,473,408 per leg
Secondary Data             1,440,768 per leg

Total disk space for all 151 cruise legs: 471,640,000 bytes


By removing infrequently used data from disk storage,
the on-line storage needs are satisfied by one 300 megabyte
disk drive.


*  Each bounds table provides room for 1000 entries. Based on
the assumptions above, it takes 10 hours to traverse a
degree square. Hence approximately 68 entries are needed
per cruise leg. Certain bounds tables will grow faster
than others, while other bounds tables will not be needed
at all.
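
The arithmetic behind Figure 4 can be checked with a short calculation. The sketch below uses only the figures quoted above; the variable names are illustrative, and the bounds table estimate assumes a ship steaming in a straight line across 60 nautical mile degree squares.

```python
import math

# Sizes quoted in Figure 4 (bytes).
SHIP_TABLE_BYTES = 412_672
BOUNDS_TABLES_BYTES = 31_186_944
PRIMARY_BYTES_PER_LEG = 1_473_408
SECONDARY_BYTES_PER_LEG = 1_440_768
CRUISE_LEGS = 151

total_bytes = (SHIP_TABLE_BYTES + BOUNDS_TABLES_BYTES
               + CRUISE_LEGS * (PRIMARY_BYTES_PER_LEG + SECONDARY_BYTES_PER_LEG))

# Bounds table entries per cruise leg: one entry per degree square entered.
LEG_DAYS = 28
SPEED_KNOTS = 6
NM_PER_DEGREE = 60  # approximate size of a degree square at the equator

hours_per_square = NM_PER_DEGREE / SPEED_KNOTS
entries_per_leg = math.ceil(LEG_DAYS * 24 / hours_per_square)
```

The total comes to 471,640,192 bytes, which rounds to the 471,640,000 bytes shown in the figure, and the entry estimate rounds up to 68.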

application of the data will define whether data as close as 1
mile or 100 miles are considered to be "near". For example,
location-dependent data collected for census studies would
resolve  distances  to  a  different  degree  than  an  analysis  of 
global  weather  patterns  (Alvarez  and  Taylor,  1974;  Nagy  and 
Wagle,  1979).  For  the  geophysical  data  library  implemented 
here,  a  one-degree  by  one-degree  area  (approximately  60 
nautical  miles  by  60  nautical  miles  at  the  equator)  defines  the 
basic  grid  size.  The  data  located  within  a  one  degree  area  can 
then  be  considered  as  near. 

Obviously,  this  division  into  one  degree  shapes  is 
arbitrary.  I  could  just  as  well  have  chosen  one  degree  shapes 
which  center  on  the  intersection  of  integral  latitude  and 
longitude  lines.  However,  my  choice  reflects  the  common 
approach.  A  more  important  decision  was  that  of  choosing  one 
degree  shapes  as  opposed  to  larger  or  smaller  geographical 
areas.  By  choosing  the  one  degree  division  I  balanced  the  need 
for  high  hit  ratios  during  data  retrieval  against  the 
maintenance,  storage  and  complexity  costs  inherent  in  choosing 
very  small  base  areas  (Wagle,  1978). 

The  organization  I  chose  to  implement  the  geographical 
proximity  relationship  is  a  data  structure  called  the  bounds 
table.  This  structure  views  the  earth’s  surface  as  10  degree 
by  10  degree  areas  for  a  total  of  648  areas  covering  the 
earth. Each of these areas is further divided into 1 degree by
1 degree areas, or one hundred subareas per area. Access to
data by geographical area is accomplished using the following
steps:

Step  1.  Choose  the  10  by  10  degree  area  based  on  the 
desired  geographical  position. 

Step 2.  Select the 1 by 1 degree area within the 10
by 10 degree area defined above.

In the implementation of the bounds table I use separate
random access files for each 10 degree by 10 degree area. Step
1 in the algorithm is a mapping from a latitude and longitude
into a file name. The sign of the latitude determines whether
the file name contains an 'N' (north for positive) or 'S'
(south for negative) in position 4 of the file name. The sign
of the longitude determines whether the file name contains an
'E' (east for positive) or 'W' (west for negative) in position
6 of the name. The most significant digit of the latitude and
the two most significant digits of the longitude are used in
positions 5 and 7 through 8, respectively, to uniquely define
the bounds table file names. Figure 5 is a diagram of a bounds
table.
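
The mapping in step 1 can be sketched as below. This is a minimal illustration, not the report's Fortran routine: the three-character prefix "BND" in positions 1 through 3 is hypothetical (the text does not say what occupies them), and "most significant digit" is read as the tens digit of latitude and the hundreds and tens digits of longitude.

```python
def bounds_table_name(lat: float, lon: float, prefix: str = "BND") -> str:
    """Build an 8-character bounds table file name from a position.

    Positions 1-3: hypothetical prefix (an assumption, not from the report)
    Position  4:   'N' or 'S' from the sign of the latitude
    Position  5:   tens digit of the latitude
    Position  6:   'E' or 'W' from the sign of the longitude
    Positions 7-8: hundreds and tens digits of the longitude
    """
    if len(prefix) != 3:
        raise ValueError("prefix must be exactly three characters")
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        raise ValueError("position out of range")
    north_south = "N" if lat >= 0 else "S"
    east_west = "E" if lon >= 0 else "W"
    lat_tens = int(abs(lat)) // 10
    lon_tens = int(abs(lon)) // 10
    return f"{prefix}{north_south}{lat_tens}{east_west}{lon_tens:02d}"
```

For example, a position at 23.5 degrees south, 17.2 degrees east maps to "BNDS2E01" under these assumptions. Positions exactly at a pole or on the 180 degree meridian are edge cases the text does not address.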

Access to data within the one degree squares is
accomplished by accessing one of 100 pointer records contained
within the file.

Figure 5

Bounds Table

The name of the bounds table uniquely specifies which 10
degree by 10 degree area is being accessed.

Record 1       Free record pointer, creation date,
               date last updated

Record 2*      Pointer for degree 1
  ...
Record 101*    Pointer for degree 100

Record 102**

*  These records contain a count of the total number of
navigational records in each degree square plus a pointer
to the start of individual linked lists beginning after
record 101.

**  The remaining records in the table are part of linked
lists and contain records which store the total number of
records available for each data type for each degree square
and a pointer record into the primary data files each time a
cruise leg enters the degree square.

A simple formula, implementing step 2 above, is used to access
this information:

linear position within file = absolute (latitude units)
times 10 + absolute (longitude units) + 2
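
As a minimal sketch of this formula, assuming "latitude units" and "longitude units" mean the units digit of the whole-degree latitude and longitude (the function name is illustrative):

```python
def pointer_record_number(lat: float, lon: float) -> int:
    """Record number of the degree square pointer within a bounds table file."""
    lat_units = int(abs(lat)) % 10   # units digit of whole-degree latitude
    lon_units = int(abs(lon)) % 10   # units digit of whole-degree longitude
    return lat_units * 10 + lon_units + 2
```

With this reading the result always lies between 2 and 101, which matches the 100 pointer records that follow the header record in Figure 5. A position at 23.5 degrees south, 17.2 degrees east gives 3 times 10 plus 7 plus 2, or record 39.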

Additional  facts  are  maintained  in  these  tables  so  that  user 
queries  about  amounts  and  types  of  data  available  can  be 
answered  without  having  to  access  the  original  data  records. 
The  data  structure  used  within  a  file  is  a  linked  list.  It  is 
easy  to  maintain  and  update  and  provides  sufficient  response 
times  for  typical  queries.  A  single  bounds  table  file, 
containing  information  about  100  separate  degree  squares,  must 
be  at  least  101  records  long.  (Each  record  is  48  bytes  long.) 
One  additional  record  is  needed  each  time  a  cruise  leg  passes 
into  a  degree  square.  This  record  contains  a  pointer  to  the 
position  data.  A  separate  record  is  maintained  for  each  data 
type  in  order  to  record  how  many  data  points  exist  in  the 
degree  square. 

This  scheme  allows  for  the  fact  that  there  will  often  be 
many  10  degree  by  10  degree  areas  containing  no  data.  In  such 
a  case,  the  bounds  table  for  this  area  need  not  exist.  This 
reduces  the  overhead  due  to  sparse  data  collections. 

The second relationship, the cruise leg organization of the
data, provided the basis for many early geophysical data
storage schemes. While each scheme uses its own way to
identify a collection period, either by project name or 'cruise
leg', the period invariably describes a continuous interval,
beginning and ending at a port stop.

In  order  to  support  the  cruise  leg  or  ’commonality  of 
collection  platform’  relationship,  the  library  scheme  provides 
a  ship  table  data  structure  similar  to  the  bounds  table 
described  earlier.  Besides  maintaining  a  pointer  to  the  first 
position  of  each  cruise,  the  ship  table  contains  information 
about  the  cruise,  including  the  unique  cruise  name,  start  and 
end  dates,  chief  scientist's  name,  number  of  points  collected, 
project  name  and  funding  agency.  This  information  is  used  to 
satisfy  ad  hoc  requests  by  researchers  and  management.  Figure 
6  is  a  diagram  of  the  ship  table. 

A straightforward hashing algorithm (Knuth, 1973;
Severance and Duhne, 1976) is used to store and retrieve
information from the ship table. This provides direct access
to the table based on cruise identifier and is independent of
the particular approach taken to identify the collection
platform. As long as the table remains less than eighty
percent full, quick access of this information is almost
guaranteed.
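
The idea of hashing a cruise identifier into the ship table can be sketched as follows. This is an illustration under stated assumptions, not the report's algorithm: the hash function (sum of character codes modulo the table size), the linear probing used for collisions, and the class and method names are all hypothetical. Only the modulo step and the eighty percent limit come from the text.

```python
class ShipTable:
    """Toy hash table keyed by cruise leg identifier (illustrative only)."""

    MAX_LOAD = 0.8  # the text suggests keeping the table under 80% full

    def __init__(self, modulo: int = 211):
        self.modulo = modulo            # modulo value used in the address
        self.slots = [None] * modulo    # each slot: (cruise_id, details)
        self.used = 0

    def _home(self, cruise_id: str) -> int:
        # Assumed hash: character codes summed, reduced modulo table size.
        return sum(ord(ch) for ch in cruise_id) % self.modulo

    def _find(self, cruise_id: str) -> int:
        # Linear probing (an assumption); stops at the key or an empty slot.
        i = self._home(cruise_id)
        while self.slots[i] is not None and self.slots[i][0] != cruise_id:
            i = (i + 1) % self.modulo
        return i

    def insert(self, cruise_id: str, details) -> None:
        i = self._find(cruise_id)
        if self.slots[i] is None:
            if (self.used + 1) / self.modulo > self.MAX_LOAD:
                raise RuntimeError("ship table would exceed 80% full")
            self.used += 1
        self.slots[i] = (cruise_id, details)

    def lookup(self, cruise_id: str):
        entry = self.slots[self._find(cruise_id)]
        return None if entry is None else entry[1]
```

Because the load limit always leaves empty slots, every probe sequence ends, and lookups stay short as long as the table is not allowed to fill.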

Figure 6

Ship Table

Record 1            First pointer to alphabetical linked list,
                    modulo value used in hashing address
                    computation, next free record in second part
                    of table.

Record 2            These records are accessed directly using the
through             cruise leg identifier in a hashing algorithm.
Record # 'Modulo'   Each record contains the identifier, next
                    pointer in the alphabetical linked list, next
                    pointer for cruise leg details and security
                    code.

Second part         Each cruise leg which is identified in the
of table            first part of the table maintains its own
                    separate linked list of records further
                    describing itself. This second part of the
                    table stores the following information
                    about each cruise leg: port stops, chief
                    scientist's name, project name, start and end
                    dates, comments, contributor, the number of
                    data points for each data type stored for the
                    cruise and pointer into the primary data file.

I considered other approaches in implementing the geographical
proximity and cruise leg organization relationships. For example,
the information contained in the bounds table and the ship table
could  have  been  placed  together  in  one  data  structure.  This 
method  would  save  disk  storage  and  simplify  data  retrieval  for 
certain  types  of  user  queries.  However,  an  important  disadvantage 
is  that  there  is  a  loss  of  simplicity.  Adding  new  relationships 
would  overly  complicate  this  combined  data  structure. 

Dictionary  Structure 

The  dictionary  describes  the  attributes  of  each  data 
element  stored  in  the  library.  Using  a  dictionary  allows  for 
easier  growth  when  new  data  attributes  are  added,  as  well  as 
fostering  data  integrity.  This  latter  aspect  is  accomplished 
by  including  valid  range  values  in  the  dictionary  and  requiring 
each  data  attribute  to  be  in  this  range  before  it  is  stored. 
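
A minimal sketch of this range check is given below. The entry layout, the function names and the depth limits are hypothetical and serve only to show how a dictionary lookup can reject a value before it is stored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DictionaryEntry:
    name: str
    minimum: float   # smallest valid value (hypothetical limit)
    maximum: float   # largest valid value (hypothetical limit)

    def accepts(self, value: float) -> bool:
        return self.minimum <= value <= self.maximum


# Illustrative dictionary with one term; the limits are assumptions.
DICTIONARY = {
    "CORRECTED DEPTH": DictionaryEntry("CORRECTED DEPTH", 0.0, 12_000.0),
}


def validate(name: str, value: float) -> float:
    """Return the value if the dictionary allows it, otherwise raise."""
    entry = DICTIONARY.get(name)
    if entry is None:
        raise KeyError(f"{name!r} is not a dictionary term")
    if not entry.accepts(value):
        raise ValueError(f"{value} is outside the valid range for {name!r}")
    return value
```

Adding a new data attribute then means adding one dictionary entry, with no change to the checking code.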

The  dictionary  also  provides  the  mechanism  to  reduce  data 
storage.  It  contains  the  information  necessary  to  decode 
stored, compressed data. A hash addressing technique provides
near O(1) access to entries in the table.* Collisions are
handled using the primary area overflow technique (Knuth, 1973).


*  The notation O(f(n)) is read "of order f of n". The meaning
here is that data access depends on a function of n, the number
of records in the data structure. With f(n) = 1, a constant,
data access is independent of the number of records in the data
structure.

When an item description of a new data parameter is added
to the dictionary, it is assigned a unique prime number. The
number  is  then  used  to  help  determine  what  data  attributes  are 
available  at  each  geographical  location  without  having  to 
access  these  data  individually.  This  is  accomplished  by 
maintaining  a  product  of  prime  numbers  at  each  geographical 
location.  A  simple  division  and  check  for  remainder  determines 
the  existence  of  data  at  a  location.  For  example,  assume  that 
the  'data  available'  product  for  the  location  under 
consideration  has  the  value  782  (the  product  of  the  prime 
numbers  2,  17  and  23).  A  user  wants  to  know  if  'free-air 
anomaly'  and  'geomagnetic  anomaly'  values  exist  at  this 
location. The dictionary may have assigned the prime numbers
23 to 'free-air anomaly' and 19 to 'geomagnetic anomaly'. The
retrieval software now performs two 'division and check for
remainder' operations. Since 782 divided by 23 yields 34 and
no remainder, a 'free-air anomaly' value has been stored
and is available. However, 782 divided by 19 yields 41 with a
non-zero remainder of 3. This means that a 'geomagnetic
anomaly' value has not been stored and is unavailable for
retrieval. This procedure eliminates the need for the system
to actually access secondary storage to check if the requested
data exist.
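
The 'division and check for remainder' test can be written in a few lines. The sketch below reuses the numbers from the example above; the function and constant names are illustrative.

```python
from math import prod


def availability_product(primes) -> int:
    """Product of the primes of every attribute stored at a location."""
    return prod(primes)


def is_available(product: int, attribute_prime: int) -> bool:
    """An attribute is present exactly when its prime divides the product."""
    return product % attribute_prime == 0


FREE_AIR_ANOMALY = 23      # prime assigned in the example above
GEOMAGNETIC_ANOMALY = 19   # prime assigned in the example above

location_product = availability_product([2, 17, 23])
```

Here the product is 782, so the check succeeds for the prime 23 and fails for 19, as in the text. One design consequence worth noting is that the product grows quickly as attributes are added, so the number of attributes per location is limited by the integer size available.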

Data Storage Scheme

I have classified information to be stored in the library
into three categories: primary data, secondary data and
pointers. In this section I describe how these data are
organized to provide an efficient, cost effective and flexible
means to store location-dependent data.

The  primary  data  include  date,  time,  latitude  and  longitude 
information.  These  data  are  organized  into  direct  access  files 
according  to  the  particular  one  degree  geographical  square  in 
which  the  data  reside.  Primary  data  are  stored  in  primary  data 
files.  I  selected  this  approach  as  it  provides: 

1.  growth  potential  for  other  than  geophysical  data, 

2.  the  ability  to  reduce  the  on-line  size  of  the 
library  while  affecting  a  minimum  number  of  users, 

3.  economic  data  storage,  and 

4.  simplicity. 

One implementation of this scheme would require as many as
64,800 files if data existed in each one degree area of the
world. Even taking into account my relatively sparse data set,
this would be too many files for the operating system.
However, in the implementation selected, only four files are
used to cover the world, one file for each geographical
quadrant of the earth. I define the four geographical
quadrants as the northeast, northwest, southeast and southwest
portions  of  the  earth's  surface.  With  this  definition,  Canada 
is  located  in  the  northwest  quadrant  and  Australia  is  located 
in  the  southeast  quadrant.  Within  each  quadrant  file,  data  can 
still  be  accessed  by  degree  because  of  the  bounds  table 
structure  described  earlier.  This  scheme  retains  the  desired 
retrieval  capabilities  without  undue  file  manipulation. 

Other divisions of the data can be implemented, if the
algorithm which converts a position into a file name is
changed. I retain this flexibility by using a direct encoding
of the latitude and longitude values to represent a file name.
A side benefit of using an encoding scheme rather than storing
the file names explicitly is that less storage space is used.

All information about a geographical location not
considered as primary data is called secondary data and is
stored in a separate area, the secondary data files. Among the
many advantages to this approach are the following:

1.  Adding new data attributes does not affect the
organization of the primary data.

2.  Secondary data can be removed entirely from
on-line disk storage without affecting queries
based on position or time.

3.  Selected portions of the secondary data can
migrate on and off the disk as the needs for
these data change.

4.  The data storage cell size need not be the same
in the primary and secondary files, making better
use of disk space.

Although  the  grouping  of  data  within  a  storage  cell 
(consecutive  words  of  disk  storage)  in  the  secondary  data  files 
can  be  arbitrary,  data  which  are  known  to  be  retrieved  together 
are  placed  in  the  same  storage  cell  to  improve  retrieval  times. 

The third category of information is pointer data.
Pointers are used to link related information, using a linked
list structure. Pointer fields in the primary data files exist
for each relationship in which position records may
participate. Currently, these relationships are the
geographical proximity relation and the cruise leg organization
of the data. A linked list structure is also used in the
secondary data files to connect data attributes acquired at the
same location and time.

I chose this approach to link information because:

1.  new  data  attributes  can  be  easily  added; 

2.  newly  identified  relationships  can  be 
implemented  easily  by  adding  an  additional 
pointer  between  common  entities; 

3.  data  can  be  easily  deleted  from  a  linked  list;  and 

4.  lists  can  be  reordered  to  improve  retrieval. 
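
A primary record that participates in both relationships can be pictured as a record with one pointer field per relationship. The sketch below is illustrative only: the field names, the use of list indices as pointers and the sample positions are assumptions, not the report's record layout.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional


@dataclass
class PrimaryRecord:
    cruise: str
    lat: float
    lon: float
    next_in_square: Optional[int] = None   # geographical proximity list
    next_in_cruise: Optional[int] = None   # cruise leg list


def follow(records: List[PrimaryRecord], start: Optional[int],
           field: str) -> Iterator[PrimaryRecord]:
    """Walk one linked list by repeatedly following the named pointer field."""
    index = start
    while index is not None:
        record = records[index]
        yield record
        index = getattr(record, field)


# Two hypothetical cruise legs; records 0 and 2 share a degree square.
records = [
    PrimaryRecord("LEG A", -21.2, 14.3, next_in_square=2, next_in_cruise=1),
    PrimaryRecord("LEG A", -22.6, 14.9),
    PrimaryRecord("LEG B", -21.7, 14.5),
]

cruises_in_square = [r.cruise for r in follow(records, 0, "next_in_square")]
track_of_leg_a = [r.lat for r in follow(records, 0, "next_in_cruise")]
```

Supporting a newly identified relationship would amount to adding one more pointer field and walking it with the same routine.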

Functional  Components 

This section describes major functional components of the
library system: query commands, use statistics, interface to
existing Mercator chart making and analysis routines, and data
security and integrity.

The query commands used to access the library take the
following form:

'command verb'[/'option'...] ['object',...]

where 'command verb' is an acceptable command action verb such
as DISPLAY, DEFINE, HELP, LIST, SHOW, and SET; 'option' is
zero, one, or more verb modifiers separated by a slash; and
'object' is zero, one, or more parameters separated by a comma.
Appendix IV is a complete user's manual and describes these
commands in detail.
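
A command of this form can be split into its parts with a few string operations. The sketch below is an assumption about how such parsing might look, not the query program's own code; the `Command` type and `parse_command` function are hypothetical names.

```python
from typing import List, NamedTuple


class Command(NamedTuple):
    verb: str
    options: List[str]
    objects: List[str]


def parse_command(line: str) -> Command:
    """Split 'VERB/OPT/OPT obj,obj' into verb, options and objects."""
    head, _, rest = line.strip().partition(" ")
    verb, *options = [part.strip().upper() for part in head.split("/")]
    if not verb:
        raise ValueError("missing command verb")
    # Objects are separated by commas and may themselves contain blanks.
    objects = [obj.strip() for obj in rest.split(",") if obj.strip()]
    return Command(verb, [opt for opt in options if opt], objects)
```

For example, `parse_command("SET/BOUNDS -21,-25,14,20")` yields the verb SET, the option BOUNDS and four objects, while a cruise leg identifier such as ATLANTIS II 67 5 stays together as a single object because it contains no comma.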




Figure 7 shows a sample query session. Once the user is
logged onto the computer, the query program is initiated with a
RUN DBAO:[GNG]DBQUERY command. The program uses two
consecutive "greater than" symbols as a prompt to the user. To
improve readability, all input entered by the user is
underlined here.

The  first  command  shown  is  a  LIST/FULL  command.  This 
command  causes  the  system  to  retrieve  all  the  information  it 
has  about  the  specified  cruise  leg.  Note  that  a  summary  of  the 
data  available  for  the  cruise  is  included.  This  information  is 
readily  available  from  the  ship  table  and  involves  a  simple 
direct  lookup. 

The  second  command  shown  is  the  HELP  command,  used  here  to 
provide  instructions  as  to  use  of  the  DEFINE  command.  The  HELP 
facility  provides  a  user  with  basic  information  about  how  to 
use  the  retrieval  program.  The  third  command  shown  is  the 
DEFINE  command,  used  here  to  obtain  the  definition  and  certain 
attributes  of  the  dictionary  term  ’corrected  depth’.  Note  that 
this  same  dictionary  structure  is  used  by  the  data  insertion 
software  to  foster  data  integrity  by  verifying  the  type  and 
magnitude  of  values  being  inserted.  The  DEFINE  command  can  be 
used  with  the  ALL  option  to  obtain  an  alphabetical  list  of  all 
valid  dictionary  terms. 

The fourth command shown is the SET command, which
restricts the user's area of interest to the geographical area
bounded by 21 degrees south and 25 degrees south and by 14
degrees east and 20 degrees east. The system saves this
user-supplied information internally for use during retrieval.
The fifth command, DISPLAY, results in a summary of the data
parameters available within these bounds. To satisfy this
request, the system searches each bounds table affected by the
user-selected geographical bounds and computes a sum of all
data parameters available. Finally, the user ends the query
session by issuing the END command. The system responds with a
message and a summary of the computer resources used during the
session. Additional examples of retrieval commands are included
in the user's manual, Appendix IV.
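
Finding which bounds tables a DISPLAY request touches can be sketched as follows. This is an illustration under assumptions: the function name and the tuple used to identify a 10 degree area are hypothetical, squares are treated as half-open cells, and bounds that wrap across the 180 degree meridian are not handled.

```python
import math
from collections import defaultdict


def squares_in_bounds(lat_a: float, lat_b: float, lon_a: float, lon_b: float):
    """Group the one degree squares inside the bounds by 10 degree area.

    Keys are (N/S, latitude tens digit, E/W, longitude tens digits);
    values list the (latitude units, longitude units) of each square.
    """
    south, north = sorted((lat_a, lat_b))
    west, east = sorted((lon_a, lon_b))
    tables = defaultdict(list)
    for i in range(math.floor(south), math.ceil(north)):
        for j in range(math.floor(west), math.ceil(east)):
            # Whole-degree magnitude of the square whose corner is (i, j).
            lat_abs = i if i >= 0 else -(i + 1)
            lon_abs = j if j >= 0 else -(j + 1)
            area = ("N" if i >= 0 else "S", lat_abs // 10,
                    "E" if j >= 0 else "W", lon_abs // 10)
            tables[area].append((lat_abs % 10, lon_abs % 10))
    return dict(tables)


affected = squares_in_bounds(-21, -25, 14, 20)
```

For the bounds used in the sample session, this gives 24 degree squares that all fall in one 10 degree area (20 to 30 degrees south, 10 to 20 degrees east), so under these assumptions a single bounds table would be read.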

Figure 7

Sample Query Session


$ RUN DBAO:[GNG]DBQUERY

DBQUERY - VERSION 1.00 - 80OCT29 08:12:21


Hi!  With  this  program  you  can  request  information 
from  the  dictionary  and  library  files.  If  you  need 
assistance  with  any  of  the  commands  just  type  in  HELP 
for  more  information. 

>> LIST/FULL ATLANTIS II 67 5

*0TPTSTBLF  message,  full  contents  of  ship  table  entry  follows: 

Cruise  leg  id:  ATLANTIS  II  67  5 

Start  date:  1972/  4/12  0.00  End  date:  1972/  5/  6  0.00 

Date  added:  1980/  5/13  1617.54 
Flags:  0  Security  code:  0  Reference  151 

Reserved:  0 

Project  name :  IDOE 
Chief  scientist:  MILLIMAN 
Port  stops:  WALVIS  BAY-WALVIS  BAY 
Contributor:  Woods  Hole  Oceanographic  Institution 
Comments:  SE  ATLANTIC  OCEAN 

Parameter  name  Number  of  Data  Points 

CORRECTED  DEPTH  1310 

POSITION(S)  1310

>>HELP DEFINE


DEFINE  Command: 

This  command  allows  you  to  retrieve  dictionary  term 
definitions  and  other  dictionary  information  from  the 
dictionary.  The  form  of  the  command  is 

DEFINE/OPTION [TERM1],[TERM2],...

where [TERM1], [TERM2], etc. are terms that you wish to
have  defined.  If  the  terms  are  not  in  the  dictionary  a 
message  will  indicate  that. 

OPTIONS:

/ALL       generates an alphabetical list of all
           dictionary terms. You may restrict
           this list by specifying the starting
           and ending terms.

/PARTIAL   Display definition plus some more
           information

/WORD      Display only the definition (default)

>>DEFINE/PARTIAL CORRECTED DEPTH
CORRECTED  DEPTH: 

Water  depth  (  0)  corrected  for  transducer  depth,  sound 

velocity,  and  tides 

ID  #:  19  Data  file  Type:  secondary 

Starting  byte  position  in  group  #  2  is  9  (Bit  Position:  0) 

Original  variable  type:  real 

Variable  stored  as  type:  integer*4 

>>SET/BOUNDS -21,-25,14,20

>>DISPLAY

*PRTDATSUM message, summary of              -21
data available in the area              +---------+
indicated to the right.              14 |         | 20
                                        +---------+
                                            -25


Parameter name            Number of data points


POSITION(S) 
CORRECTED  DEPTH 
GEOMAGNETIC  ANOMALY 
FREE  AIR  ANOMALY 
0

0 


>>END

GOOD  BYE  FOR  NOW. 


TIMES IN SECONDS      PAGE      DIRECT    BUFFERED
CPU       ELAPSED     FAULTS    I/O       I/O
2.63      51.28       504       151       118

A menu approach for the user interface was also considered
but  not  implemented.  A  high-speed  cathode  ray  tube  type 
terminal  is  essential  for  this  approach  and  I  anticipated  that 
many  users  would  be  annoyed  when  they  had  to  use  slower  speed 
terminals.  On-line  documentation  is  provided  through  the  use 
of  the  HELP  command. 

The  system  has  a  record-keeping  facility  to  keep  use 
statistics.  This  facility  provides  a  mechanism  for  documenting 
the  library  activity  and  is  useful  for  the  following  reasons: 

1.  Unused,  secondary  information  can  be  removed  from 
on-line  storage  to  reduce  storage  costs. 

2.  Charges  based  on  actual  use  can  be  computed. 

3.  Library growth and data retrieval patterns can be
studied.

4.  The  need  for  data  reorganization  can  be 
determined. 

The  interface  to  data  analysis  software,  especially  the 
chart  plotting  programs,  will  be  implemented  using  a  Fortran 
routine  which  allows  another  program  to  retrieve  data  directly 
from  the  library  based  on  geographical  boundaries  and  cruise. 
The  routine  passes  back  to  the  calling  program  one  primary 
record  at  a  time  and  zero  or  more  secondary  data  attributes 
depending  on  what  was  requested. 


The  more  common  retrieval  requests,  such  as  requests  for 
navigation  or  data  charts,  will  be  part  of  the  query  software. 
Users  can  specify  the  chart  bounds,  scale  and  parameters  needed 
by  the  plotting  routines.  These  parameters  are  then 
reformatted  and  sent  to  an  existing  plotting  routine. 

It  is  also  possible  to  produce  copies  of  data  from  the 
library  in  the  exchange  format  (NOAA,  1977)  on  tape.  This 
option  allows  for  simple  data  exchanges  among  other  research 
and  government  organizations  and  also  allows  users  to  have 
their  own  copies  of  the  data. 

Data integrity is maintained by the use of parameters
stored in the dictionary to verify all data before they are
stored in the library. Recovery from software or system
crashes is accomplished by rolling back the library to the
state it was in before the event. This is made possible by
maintaining a copy of all data base files on magnetic tapes.
Data security is provided by an authorization file which
maintains information about valid users. Data insertion,
modification, or deletion can only be performed by authorized
individuals.


Chapter V

Conclusion

I  have  designed  and  implemented  a  storage  and  retrieval 
scheme  which  provides  cost  effective  and  easy  access  to 
location-dependent, geophysical data. The scheme is
operational  on  a  Digital  Equipment  Corporation  VAX-11/780 
computer.  Information  about  data,  such  as  port  stops,  project 
name  and  funding  agency,  as  well  as  data  values,  are  available, 
on-line,  to  a  time  sharing  user  validated  to  use  the  system. 
Appendix  IV  is  a  user’s  manual  which  describes  how  to  use  this 
system. 

In  order  to  minimize  the  software  development  time  and 
effort,  I  took  advantage  of  the  capabilities  of  the  operating 
system  and  existing  software  whenever  possible.  For  example, 
standard  system  supplied  utilities  are  used  to  back  up  the 
library  on  magnetic  tape  and  existing  graphics  routines  were 
used.  However,  even  with  these  shortcuts,  the  software  system 
takes  more  than  14,000  lines  of  code.  Appendix  V  contains  a 
list  of  system  routines  by  category  and  some  software 
statistics. 

These computer routines do not quite comprise a true data
base management system since they do not provide users with
different logical views of the data. However, other features
of a DBMS are present: multi-user access to the data is
provided; data requests are satisfied with reasonable speed;
storing and retrieving costs are minimized; and data are
stored accurately and consistently.

This  new  data  storage  and  retrieval  system  can  be 
considered  a  success  only  if  it  is  used,  and  if  it  provides  the 
features  needed  by  at  least  a  majority  of  the  users. 
Demonstrations  of  the  system  have  been  well  received.  The  full 
power  of  the  system  will  not  be  realized  until  a  significant 
portion  of  existing  data  are  added.  I  anticipate  that  this 
task  will  be  complete  within  a  few  months. 

Implementation  of  the  system  has  followed  the  original 
design  plans.  Certain  detail  features,  missed  during  the  early 
planning  but  evident  during  implementation  and  testing,  were 
added.  For  example,  the  dictionary  needed  to  store  the 
original  format  of  a  data  parameter,  as  well  as  the  stored 
format,  so  that  values  can  be  converted  back  from  their 
compressed  form  into  their  original  form.  Also,  I  changed  the 
bounds  table  by  including  summary  data  statistics.  This 
feature  was  not  included  in  the  early  design,  but  is  needed  to 
facilitate users' requests about the existence of data values.

Finally, implementing the data insertion routine was more
difficult than expected. Manipulating the data and bounds
tables as the data enter and leave degree squares was quite
complex.* However, even with the extensive data processing,
the  central  processing  time  needed  to  store  data  in  the  new 
scheme  compares  quite  favorably  to  the  time  to  store  data  in 
the  older,  sequentially  accessed  scheme.  The  improved 
accessibility  to  information  about  the  data  and  cruise  legs,  as 
well  as  the  improved  accessibility  to  the  data  values,  has  been 
achieved. 


* Part of this difficulty was due to the decision to add to
the primary and secondary data files sequentially, rather
than randomly. This design feature minimizes the use of
disk space and improves processing speed, but at the
expense of programming complexity.


Appendix I


Data  Attributes 


ALTITUDE  MAGNETIC  SENSOR 

Position  of  the  primary  magnetic  sensor  below  (-)  or 
above  (+)  sea  level,  in  meters.  Synonym:  DEPTH  MAGNETIC  SENSOR 

BATHYMETRY  CORRECTION  CODE 

Details  the  procedure  used  for  determining  the  sound 
velocity  correction  used  to  correct  the  water  depth 
measurement.

BATHYMETRY  QUALITY  CODE 

Specifies  the  quality  of  the  recorded  depth  measurements. 

BATHYMETRY  TYPE  CODE 

Specifies  how  the  depth  value  was  derived  (e.g.  observed, 
interpolated,  etc.). 

BOUGUER  ANOMALY 

A  specific  interpretation  of  the  gravity  free-air  anomaly 
taking the water depth into account, in milligals.


CORRECTED DEPTH

Water  depth  corrected  for  transducer  depth,  sound  velocity, 
tides,  etc.,  in  meters. 

CRUISE 

Identifies  a  particular  voyage  of  a  ship  in  which  the  ship 
made  one  or  more  port  stops.  Synonym:  cruise  number. 

CRUISE  IDENTIFIER 

Identifies a particular cruise leg or legs of a voyage.
This is usually unique within a particular data gathering
agency. Synonym: a concatenation of SHIP, CRUISE and LEG.

CUMULATIVE  DISTANCE 

The  distance  traveled  by  the  ship  since  the  start  of  the 
cruise  leg  (usually  with  0  at  the  departing  port)  until  the 
present  time,  in  kilometers. 

CURRENT  HEADING 

The  direction  or  orientation  of  the  water  current,  in 
degrees. This value is derivable from CURRENT SPEED NORTH
and CURRENT SPEED EAST.

CURRENT SPEED EAST

The  component  of  the  current  velocity  in  the  east 
(positive)  or  west  (negative)  direction,  in  meters  per  second. 

CURRENT  SPEED  NORTH 

The  component  of  the  current  velocity  in  the  north 
(positive)  or  south  (negative)  direction,  in  meters  per 
second.

CURRENT  VELOCITY 

A vector that specifies the speed and direction of the current
through which the ship is traveling. This value is usually
given as two numbers: either the north and east components, or
the current speed and current heading.
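
The two representations of CURRENT VELOCITY can be converted into one another. The following is a minimal sketch, not the library's own routine. It assumes the heading is measured in degrees clockwise from north and gives the direction toward which the current flows; the source does not state this convention, and the function names are hypothetical.

```python
import math


def components_to_speed_heading(north, east):
    """Convert north/east components (m/s) to (speed, heading in degrees)."""
    speed = math.hypot(north, east)
    # atan2(east, north) is the angle clockwise from north (assumed convention).
    heading = math.degrees(math.atan2(east, north)) % 360.0
    return speed, heading


def speed_heading_to_components(speed, heading):
    """Convert speed (m/s) and heading (degrees) back to (north, east)."""
    angle = math.radians(heading)
    return speed * math.cos(angle), speed * math.sin(angle)
```

Under these assumptions a current flowing due west (north = 0, east negative) has a heading of 270 degrees.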

DATA  RECORD  TYPE 

Under  existing  data  storage  schemes  this  parameter 
identifies  the  type  of  record  in  order  to  distinguish  among 
header and data record types. In a new scheme it can have a
similar meaning and takes on a more central and important role
during data retrieval.

DAY 

Specifies the day of the month. When combined with the
month, a form of Julian day is formed which can be combined
with the hour. See HOUR DAY.

DEPTH MAGNETIC SENSOR

See  ALTITUDE  MAGNETIC  SENSOR 

DIURNAL  CORRECTION 

A  correction  applied  to  the  measured  magnetic  field  in 
order  to  correct  for  daily  variation  in  the  regional  field. 

EOTVOS  CORRECTION 

A  correction  applied  to  the  measured  total  gravity  field 
related to the ship's speed and heading, in milligals.

FREE-AIR  ANOMALY 

The  observed  gravity  minus  the  theoretical  gravity  value, 
in  milligals. 

GEOMAGNETIC  ANOMALY 

The  total  geomagnetic  field  minus  the  theoretical  value  for 
the  total  field.  The  diurnal  correction  has  been  applied  if 
available,  in  gammas.  Synonyms:  magnetic  anomaly,  magnetic 
residual  field,  total  geomagnetic  residual  field. 

GEOMAGNETIC  QUALITY  CODE 

Specifies the quality of the recorded geomagnetic
measurements.

GRAVITY QUALITY CODE

Specifies the quality of the recorded gravity measurement.

HEIGHT

The height above sea level (positive) or below sea level
(negative) of the acquisition platform. For ship acquisition
this is usually taken to be 0, in kilometers.

HOUR  DAY 

An encoding of the hour within the day and the day of the
year, with January 1st being day 1, defined as HOUR * 1000 +
day of year.
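
The HOUR DAY encoding can be illustrated with a short sketch. The function names are hypothetical, and the hour is assumed to be a whole number from 0 to 23; the formula itself (HOUR * 1000 + day of year) is taken from the definition above.

```python
import datetime


def encode_hour_day(year, month, day, hour):
    """Return HOUR * 1000 + day of year, with January 1st as day 1."""
    day_of_year = datetime.date(year, month, day).timetuple().tm_yday
    return hour * 1000 + day_of_year


def decode_hour_day(year, hour_day):
    """Recover (month, day, hour) from an encoded value and its year."""
    hour, day_of_year = divmod(hour_day, 1000)
    date = datetime.date(year, 1, 1) + datetime.timedelta(days=day_of_year - 1)
    return date.month, date.day, hour
```

For example, 16:00 on 12 April 1972 (day 103 of a leap year) encodes as 16103. The year is needed when decoding because the day of year alone does not determine the month in leap years.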

LEG 

Identifies  the  particular  port-to-port  segment  of  a 
cruise.  Synonym:  leg  number. 

LATITUDE 

The  ship's  position  in  the  north  (positive)  -  south 
(negative)  direction  on  the  earth's  surface,  in  degrees. 

LOCATION 

The  position  of  the  data  collection  platform,  usually 
defined in terms of latitude, longitude and height.
Synonym: position.

LONGITUDE

The  ship's  position  in  the  east  (positive)  -  west 
(negative)  direction  on  the  earth's  surface,  in  degrees. 

MAGNETIC  FIELD  SENSOR 

Specifies  which  recording  sensor  was  used  as  the  primary 
recording  instrument. 

MINUTE 

Specifies  the  minute  and  fraction  of  the  hour,  in  minutes. 

MONTH 

The  month  during  which  a  data  point  was  recorded.  This 
value  is  encoded  along  with  the  day  of  the  month,  into  a  Julian 
day.  Julian  day  is  further  combined  with  the  hour  of  the  day. 
See  HOUR  DAY. 

NAVIGATION  QUALITY  CODE 

Specifies  the  quality  of  the  recorded  navigational 
measurements.

NAVIGATION  TYPE  CODE 

Indicates how the navigational information was obtained.

OBSERVED GRAVITY

Measured total gravity, corrected for Eotvos,
drift and tares, in milligals.

PROTECTION  CODE 

Specifies  the  access  privilege  required  by  the  user  in 
order  to  retrieve  data. 

SEISMIC  SHOT  POINT 

Identifies  the  shot  points  for  single  and  multichannel 
seismic  data  so  that  this  data  can  be  analyzed  with  other 
underway  measurements. 

SHIP 

Identifies  the  platform  from  which  data  were  collected. 
Synonyms: ship name, cruise ID, platform. Note: Some
institutions concatenate ship, cruise, and leg into a single
unique identifier and refer to this entity as the cruise ID.

SHIP  HEADING 

The instantaneous orientation of the ship, in degrees. See
SHIP VELOCITY.

SHIP VELOCITY

The instantaneous ship's speed north (+)/south (-), and
speed east (+)/west (-) from the current ship's position to the
next measured instant of time, in meters per second. See SPEED
NORTH and SPEED EAST.

SPEED  EAST 

The instantaneous ship's speed in the east (positive) or west
(negative) direction, a component of the ship velocity, in
meters per second.

SPEED  NORTH 

The instantaneous ship's speed in the north (positive) or
south (negative) direction, a component of the ship velocity,
in meters per second.

TIME 

The time of the data measurement(s), in GMT. See HOUR DAY
and  MINUTE. 

TIME  ZONE 

Usually  the  time  of  a  data  measurement  is  recorded  in 
Greenwich  Mean  Time  (GMT)  even  though  the  ship  operates  in 
local time. Time zone is zero if the recorded time is GMT,
else non-zero.

TOTAL MAGNETIC FIELD-1

The  value  of  the  total  geomagnetic  field  measured  by  the 
primary  sensor,  in  gammas  (or  nanotesla).  Synonyms:  total  mag 
field,  total  mag  field  1,  geomagnetic  field,  total  geomagnetic 
field. 

UNCORRECTED  DEPTH 

The water depth expressed as round trip travel time, in seconds.
Historically the depth was recorded in fathoms with an assumed
sound speed of 800 fathoms per second.
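
As an illustration only, the relation between round trip travel time and nominal depth can be sketched as follows. It assumes the stored value is the two-way travel time in seconds, so that the sound covers the depth twice; the function names are hypothetical and this is not the library's conversion routine.

```python
NOMINAL_SOUND_SPEED = 800.0   # fathoms per second, as stated in the definition
METERS_PER_FATHOM = 1.8288    # exact by definition of the international fathom


def travel_time_to_fathoms(round_trip_seconds):
    """Nominal depth in fathoms: the sound travels down and back."""
    return NOMINAL_SOUND_SPEED * round_trip_seconds / 2.0


def fathoms_to_travel_time(fathoms):
    """Round trip travel time in seconds for a nominal depth in fathoms."""
    return 2.0 * fathoms / NOMINAL_SOUND_SPEED


def travel_time_to_nominal_meters(round_trip_seconds):
    """Nominal (uncorrected) depth in meters."""
    return travel_time_to_fathoms(round_trip_seconds) * METERS_PER_FATHOM
```

A round trip travel time of 5 seconds corresponds to a nominal depth of 2000 fathoms. The result is still uncorrected: see CORRECTED DEPTH for the sound velocity and other corrections.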

YEAR 

The year that a measurement was collected.


Appendix II


General  Design  Specifications 


This  section  is  a  copy  of  the  original  system  design 
specifications.  However,  not  all  features  described  here  are 
currently  implemented.  In  particular,  the  following  features 
are  now  available:  inserting  new  data  in  the  library,  data 
summaries by area, and data integrity features including a
record  of  transactions  and  verification  of  data  parameters 
before  they  are  stored.  Features  still  under  development 
include  data  value  retrieval  by  cruise  leg  and  geographical 
area  and  the  interface  to  existing  charting  and  profiling 
graphics  routines. 


I.  Loading  the  library. 

A.  Data  insertions:  underway  geophysical  data 
including navigation, bathymetry, geomagnetics and gravity, are
supplied  to  the  library  in  the  required  merged  or  merged-merged 
format. 

B.  Data  deletion:  data  in  the  library  which  is  in 
error  can  be  deleted  one  'cruise  leg'  at  a  time. 


C. Data update: a 'cruise leg' whose navigational
information must be changed is first deleted from storage. The
corrected data are then added as a standard insert operation.
If individual measurements within a 'cruise leg' are in error,
and do not involve positional elements, then these measurements
can be individually replaced.

II.  Retrieval. 

A.  Data  retrieval  (to  hard-copy  plots,  listings  or 
magnetic  tape  files)  is  accomplished  by  specifying: 

1.  Specific  cruises. 

2.  Specific  geographical  area. 

3. Date/time pairs.

4.  Any  combination  of  (1)  through  (3). 
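
The combination of selection criteria in II.A can be sketched as a simple record filter. This is an illustration under stated assumptions, not the implemented retrieval path: the `Measurement` and `Request` types and the `matches` function are hypothetical, a linear test stands in for the library's bounds tables, and rectangles crossing the 180 degree meridian are not handled. The sign convention (south and west negative) follows the LATITUDE and LONGITUDE definitions in Appendix I.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, Sequence, Tuple


@dataclass
class Measurement:
    cruise_leg: str
    latitude: float    # degrees, north positive
    longitude: float   # degrees, east positive
    time: datetime


@dataclass
class Request:
    # None means "no restriction" for that criterion.
    cruise_legs: Optional[Sequence[str]] = None
    bounds: Optional[Tuple[float, float, float, float]] = None  # top, bottom, left, right
    time_range: Optional[Tuple[datetime, datetime]] = None


def matches(m: Measurement, req: Request) -> bool:
    """True if the measurement satisfies every criterion that is set."""
    if req.cruise_legs is not None and m.cruise_leg not in req.cruise_legs:
        return False
    if req.bounds is not None:
        top, bottom, left, right = req.bounds
        if not (bottom <= m.latitude <= top and left <= m.longitude <= right):
            return False
    if req.time_range is not None:
        start, end = req.time_range
        if not (start <= m.time <= end):
            return False
    return True
```

Because each criterion is optional and all active criteria must hold, any combination of cruise, area and date/time restrictions reduces to the same test.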

B.  The  system  supports  interrogation  of  the  library  in 
order  to  provide  answers  to  the  following  forms  of  queries: 

1.  How  much  data  exists  in  the  North  Atlantic 
Ocean? 

2.  How  much  data  exists  in  the  geographic  area 
bounded  by  10  degrees  north,  5  degrees  south, 
33  degrees  west  and  5  degrees  east? 

3. Does data exist for ATLANTIS II cruise 17
leg 6?

4. Display on my CRT screen the ship's tracks
located in the following area: 25N, 20N, 30E,
and 5E, collected since 1975.

5. Contour the 'bathymetry' data located in the 2
degree square surrounding the point 30 degrees
south, 5 degrees east; and display the contour
plot on my CRT screen.

C.  The  system  supports  the  following  forms  of 
graphical  display  options: 

1.  Annotated  charts  displaying  cruise  tracks  and 
data  values  along  the  ship's  track. 

2.  Profiles  of  selected  cruise  leg  data  versus 
time  or  cumulative  distance. 

3.  Profiles  of  selected  cruise  leg  data, 
plotted  at  right  angles  to  the  cruise  track. 

4.  Contour  plots  of  any  data  parameter. 

III.  Data  integrity. 

A.  A  chronological  record  is  kept  of  all  transactions  with 
the  library  involving  insertion,  deletion  or  replacement, 
including  the  number  and  kinds  of  records  involved. 

B.  The  ability  to  recover  from  catastrophic  failures  such 
as power loss during the insertion/deletion/replacement
operation is provided.

IV. Operation, maintenance and cost considerations.

A.  Information  is  maintained  by  the  system  so  that  each 
user  of  the  library  can  be  charged  according  to  the  use  of  the 
system. 

B.  An  activity  summary  is  kept  of  the  retrieval  use  of  the 
system,  including  the  number  of  users,  the  types  of  requests, 
the  number  of  records  retrieved,  and  the  computer  resources 
used,  such  as  elapsed  and  cpu  time. 

C. In order to support full on-line access to the entire
library a significant portion of a dedicated disk pack and
drive is required.


Appendix  III 


Program  Structure  Diagrams 

This  section  displays  tree  structured  program  organization 
diagrams  for  three  important  functions  in  the  library  system: 
dictionary update, data insertion and user query.


[Diagram: Dictionary Update (tree structure not recoverable)]

[Diagram: Data Insertion (tree structure not recoverable)]


Appendix IV


User's  Manual 


This  appendix  is  the  user's  manual  for  DBQUERY,  the  data 
retrieval  program  for  position-dependent  data.  The  manual  describes 
how  a  user  at  a  time-sharing  terminal  connected  to  Woods  Hole 
Oceanographic  Institution's  VAX-11/780  computer  can  access 
information about 'geophysical' data stored by the computer.

I. Capabilities

The  program  used  to  access  information  in  the  library  for 
location-dependent  data  is  called  DBQUERY.  This  program  allows  the 
user  to  obtain  information  about  'cruise  legs',  retrieve  data  from 
the  library,  generate  plots,  and  retrieve  information  about  the 
parameters stored in the library.

II.  Logging  On 

The  first  step  that  a  library  user  must  take  is  to  connect  the 
time-sharing  terminal  to  the  Woods  Hole  Oceanographic  Institution's 
VAX-11  computer.  If  the  user  is  outside  of  Woods  Hole,  then  dial 
(617)  540-6000  and  ask  the  operator  for  one  of  the  following  VAX-11 
computer  extensions  depending  on  the  transmission  speed  you  require: 

Extension Number    Baud Rate of User's Terminal

6600                300 baud
6500                1200 baud

If  you  have  access  to  a  W.H.O.I.  "blue  phone"  you  can  dial  one  of 
these extensions yourself. (Some terminals at W.H.O.I. are
permanently connected to the VAX and do not require a phone
connection.) Place your phone in the modem cradle when you hear the
high pitched "carrier" tone.

At this point, touch the "return" key at your terminal. The
computer  will  respond  with 
Username: 

Enter  your  VAX  user  name  followed  by  a  return.  If  you  do  not 
have  a  user  name  then  you  must  call  the  computer  center  and  request 
one;  after  a  valid  user  name  has  been  entered  the  computer  will  print 
Password:

Enter  your  password  followed  by  a  return.  If  you  have  entered 
these  items  correctly,  the  system  will  respond  with 
Welcome  to  VAX/VMS  Version  1.60  (date) 

III.  Program  DBQUERY  Initiation 

Once  you  have  successfully  logged  onto  the  VAX  and  receive  the 
dollar  sign  ($)  prompt  character  you  can  begin  using  the  query 
program  by  typing  the  following  command: 

RUN DBA0:[GNG]DBQUERY

Program  DBQUERY  will  begin  and  print  the  following 
DBQUERY  -  Version  1.00  -  (today’s  date) 

Hi! With this program you can request information from the
dictionary and library files. If you need assistance with any of the
commands, just type HELP for more information.

The user prompt for program DBQUERY is 11. That is, when the
query program is ready to receive your next command it will display
these  characters.  The  following  section  describes  the  commands  you 
can  use.  However,  if  you  have  not  been  authorized  to  use  the  library 
system  you  will  get  the  following  message: 

*DBQUERY  message,  could  not  open  the  library  for  retrieval.  Sorry! 

If  you  get  this  message  you  cannot  use  the  library.  Contact 
Woods  Hole’s  Digital  Data  Library  staff  at  548-1400  Ext.  2581  for 
authorization  or  if  you  have  any  other  questions. 

IV.  User  Commands 

There  are  thirteen  basic  commands  that  you  can  issue  to  program 
DBQUERY.  These  are 


@
COPY       HELP
DEFINE     LIST
DISPLAY    PLOT
DONE       SET
DRAW       SHOW
END        STOP

Commands take the following general form

command verb[/option...] [object,...]

where 'command verb' is any one of the above commands; '/option' is
zero, one or more command verb options separated by a slash; and
'object' is zero, one or more parameters used by the command,
separated by a comma. Command verbs and options can be abbreviated
to as few as two letters (e.g. CO for copy, HE for help and ST for
stop).
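
The general command form and the two-letter abbreviation rule can be sketched as a small parser. This is an illustrative sketch only: the names `COMMANDS`, `resolve` and `parse_command` are hypothetical, and DBQUERY's own parsing routines may treat edge cases differently.

```python
COMMANDS = ["@", "COPY", "DEFINE", "DISPLAY", "DONE", "DRAW", "END",
            "HELP", "LIST", "PLOT", "SET", "SHOW", "STOP"]


def resolve(word, candidates):
    """Return the single candidate that `word` names or abbreviates."""
    word = word.upper()
    if word in candidates:
        return word
    if len(word) < 2:
        raise ValueError("abbreviation too short: %r" % word)
    hits = [c for c in candidates if c.startswith(word)]
    if len(hits) != 1:
        raise ValueError("unknown or ambiguous command: %r" % word)
    return hits[0]


def parse_command(line):
    """Split 'verb/opt1/opt2 obj1,obj2' into (verb, options, objects)."""
    head, _, rest = line.strip().partition(" ")
    verb, *options = head.split("/")
    objects = [obj.strip() for obj in rest.split(",") if obj.strip()]
    return resolve(verb, COMMANDS), [o.upper() for o in options if o], objects
```

For instance, `parse_command("SET/BOUNDS 10,-30,-14,5")` yields the verb SET, the option BOUNDS and four objects. Objects are split only at commas, so a cruise leg name containing blanks stays in one piece.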

A description of each of the basic commands follows. They are
listed in alphabetical order.

@ command:

This command allows you to execute one or more library commands
that have been stored in a disk file. The file must already
exist. At the present time all commands will be honored except
another @ command.

The form of the command is

@ filespec

where 'filespec' is the name of a VAX/VMS file created either
with the editor or other compatible utilities.

OPTIONS:

/ECHO  print the commands read from filespec on the
       user's terminal
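
The behavior of the @ command can be sketched as a loop over the stored command lines. This is a hedged illustration: `run_command_file`, `execute` and `out` are hypothetical names, and the manual does not say whether a nested @ command is silently skipped or reported, so the message below is an assumption.

```python
def run_command_file(lines, execute, echo=False, out=print):
    """Pass each stored command to `execute`; nested @ commands are not honored."""
    executed = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if echo:
            out(line)  # corresponds to the /ECHO option
        if line.startswith("@"):
            out("nested @ command ignored: " + line)  # assumed behavior and wording
            continue
        execute(line)
        executed.append(line)
    return executed
```

Here `lines` can be any iterable of text lines, for example an open file object, and `execute` is whatever routine handles a single command line.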


COPY  command: 

This  command  allows  you  to  copy  selected  portions  of  the  library 
onto  a  disk  file.  The  specific  data  that  you  wish  to  copy  are 
selected  using  the  SET  command. 

The  form  of  the  command  is 

COPY 


The default name of the file created is made up of the first 8
nonblank characters of your logon name with an extent of .DAT.
Also, the SET command allows you to change the default output
file name.


DEFINE command:

This command allows you to retrieve dictionary term definitions
and other information from the dictionary. The form of the
command is

DEFINE/option [term1],[term2],...

where [term1], [term2], etc. are terms that you wish to have
defined. If the terms are not in the dictionary a message will
indicate that.

OPTIONS:

/ALL      generates an alphabetical list of all dictionary
          terms

/PARTIAL  display definition plus additional information
          about where and how the parameter is stored in the
          library

/WORD     display only the definition (default)

Example

DEFINE/PARTIAL FREE AIR ANOMALY
FREE AIR ANOMALY:

observed gravity, corrected for Eotvos and drift, minus the
theoretical gravity value

ID #: 29  Data file type: secondary

Starting byte position in group # 2 is 15 (Bit position: 0)

Original variable type: real
Variable stored as type: integer*2


DISPLAY  command: 


This command prints a summary of the data which satisfy the
current retrieval request as defined by the SET command. The
form of the command is

DISPLAY/option

Options:

/CRUISE_LEGS   lists the cruise legs with a summary of the number
               of measurements satisfying your retrieval
               specifications.

/DATA_SUMMARY  lists the total number of measurements satisfying
               your bounds and data type retrieval
               specifications. (Default)

Example

DISPLAY

*PRTDATSUM message, summary of              -21
data available in the area              14 |    | 20
indicated to the right.                     -25

Parameter name           Number of data points

POSITION(S)                      44
CORRECTED DEPTH                  44
GEOMAGNETIC ANOMALY               0
FREE AIR ANOMALY                  0


DONE command:

This command stops the DBQUERY program and returns you to the
VAX/VMS monitor.

DONE  command: 

This  command  stops  the  DBQUERY  program  and  returns  you  to  the 
VAX/VMS  monitor. 


DRAW  command: 

This  command  allows  you  to  generate  charts  and  profiles  of  data 
at your CRT screen, in real time. However, at this time the
option has not been implemented.


END command:

This command stops the DBQUERY program and returns you to the
VAX/VMS monitor.


Example 

END 

GOOD BYE FOR NOW.

DIRECT   BUFFERED   TIMES IN SECONDS    PAGE
  I/O      I/O       CPU    ELAPSED    FAULTS
  143       76      1.74     162.17      159


HELP command:


The  following  library  query  commands  are  recognized: 


COPY  DEFINE  DISPLAY  DONE  DRAW  END  LIST  PLOT  SET 
SHOW  STOP  @ 


For  more  information  about  these  commands  enter 


HELP [command]

where [command] is any one of the above commands. Unique
abbreviations for the commands and options, down to two letters,
are allowed. For example, to obtain more information on the
DISPLAY command any one of the following lines is allowed:

HELP  DISPLAY 
HE  DISP 


LIST  command: 

This  command  allows  you  to  retrieve  information  about  one  or 
more  cruises  from  the  library  ship  table.  The  form  of  the 
command  is 

LIST/option cruise_leg_1,cruise_leg_2,...

where cruise_leg_1, cruise_leg_2, etc. are the names of the
cruise legs.

OPTIONS:

/ALL    produces an alphabetical list of all the 'cruise
        legs' available in the library

/BRIEF  prints some information about the 'cruise
        legs' listed in the command, such as the
        start and end dates, project name, port
        stops, chief scientist and contributor.
        (Default)

/FULL   prints all the information available about
        the 'cruise legs' listed in the command.

/MGD77  prints the MGD77 header records of the
        'cruise legs' listed in the command


Example 

LIST/FULL  ATLANTIS  II 

*OTPTSTBLF  message,  full  contents  of  ship  table  entry  follows: 

Cruise  leg  id:  ATLANTIS  II  67  5 

Start  date:  1972/  4/12  0.00  End  date:  1972/  5/  6  0.00 

Date  added:  1980/  5/13  1617.54 

Flags: 0  Security code: 0  Reference #: 151  Reserved: 0

Project name: IDOE
Chief scientist: MILLIMAN
Port stops: WALVIS BAY-WALVIS BAY
Contributor: Woods Hole Oceanographic Institution
Comments: SE ATLANTIC OCEAN

Parameter name           Number of Data Points

CORRECTED DEPTH                 1310
POSITION(S)                     1310


PLOT command:

This command allows you to generate hard copy plots of charts,
profiles and contour maps. Currently, however, this option is
not implemented.


SET  command: 

This command allows you to restrict or further define your data
retrieval request. The form of the command is

SET/option parameter_1,parameter_2,...

Each option calls for a different number of parameters. Consult
the options list below to determine the form and number of
parameters to include.

Example

SET/BOUNDS -21,-25,14,20

Options:

/BOUNDS       restricts the retrieval boundaries to the
              specified top, bottom, left and right sides of a
              geographical rectangle. The default is the whole
              world. For example, to retrieve data which lie
              only within the area bounded by 10 degrees north,
              30 degrees south, 14 degrees west and 5 degrees
              east enter

              SET/BOUNDS 10,-30,-14,5

              Note that the sign convention is south and west
              negative.

/COPY_OUTPUT  change the output file or device name for the
              COPY command to the specified file name. The
              default file name is made up of the first 8
              non-blank letters of the user's name with the
              extent of .DAT. For example, to change the file
              name to MYDATA.DAT enter

              SET/COPY_OUTPUT MYDATA.DAT

              To return to the default output file name enter

              SET/COPY_OUTPUT

/CRUISE_LEG   restricts the retrieval to the cruise legs
              specified in the rest of the command. The
              default is ALL of the cruise legs. For example,
              to restrict retrieval to data from CHAIN 115 leg
              3 and ATLANTIS II 67 leg 2 enter

              SET/CRUISE_LEG CHAIN 115 3,ATLANTIS II 67 2

/DATA_TYPE    restricts retrieval to the data types specified.
              For example

              SET/DATA_TYPE BOUGUER ANOMALY

              will result in retrieving the Bouguer anomaly
              data parameter. The default data types are
              corrected depth, geomagnetic anomaly and free air
              anomaly. To return to the default data types
              enter

              SET

              To retrieve all data, enter

              SET/DATA_TYPE ALL


SHOW command:

This command allows you to display the current values for
retrieval options selected either by default or explicitly
through the use of the SET command. The form of the command is

SHOW/option

Options:

/BOUNDS       describes the geographical retrieval
              boundary. (Default)

/COPY_OUTPUT  gives the name of the disk file or device which
              will receive the output from a COPY command

/CRUISE_LEG   lists the 'cruise legs' to which retrieval is
              restricted

/DATA_TYPE    lists the data types that will be retrieved


STOP command:

This command stops the DBQUERY program and returns you to the
VAX/VMS monitor.


Appendix V

Computer  Routine  Descriptions 


This  appendix  provides  a  summary  listing  by  category  of  the 
computer  routines  which  I  designed  and  wrote  to  implement  the  data 
library. There currently are a total of 314 routines which make up
the  software  for  the  data  library  system.  With  a  total  of  14232 
lines  of  code,  this  yields  an  average  of  about  45  lines  per 
routine.  Sixty-six  percent  of  the  lines  are  executable  code  while 
the  rest  are  comment  (documentation)  lines. 

Main  Programs 

DBINSERT  —  insert  new  data  for  an  entire  cruise  leg 
DBQUERY  —  library  query  and  retrieval  program 
DBVALID  —  validate  a  new  user  for  the  library  system 

INITDBLD  —  initialize  the  data  dictionary,  history  file, 
validation  file  and  ship  table 

SHPTBLINS  —  add  new  cruise  leg  information  to  the  ship  table 
UPDBDIC  —  update the library dictionary with a new data attribute


Input/Output routines

CLSDTLAPN    OPNRWPRI     RDSTBL5
CLSPRIAPN    OPNRWSTBL    RDSTBL7
DECLOSE      OPNWVLDE     RDSTBLCTS
DPOPNEXEC    OTPTDBMIS    RDSTBLDT
DBOPNRTV     OTPTSTBLF    RDVLDF
DBOPNUPD     PRTCNTS      READDICR
GETDDLNAV    PRTSTBL      READFULLT
GETNEWNAV    PRTTRMALL    READSEAG
GETNEWREL    PRTTRMDEF    READSTBLR
GETUSRCMD    PRTTRMPAR    READSTSO
INPTATBTS    PUTNEWDAT    RPBNDL
INPTINTP     PUTNEWDTL    RSTBLPAR
INPTSYSP     RDBND2       WRTDBHIS
INPTUSRP     RDBNDPAR     WRTDIC2
INSTODB      RDOIC2       WRTDIC23
OPNDBHIS     RDDIC23      WRTDIC3
OPNDTLAPN    RDOIC5       WRTDIC5
OPNPRIAPN    RDOICDEF     WRIDICr
OPNRDIC      RDICPARAM    WRTINTDT6
OPNRPRI      RDSTBL11     WRTFYKKT
OPNRSTBL     RDSTBL13     WRTSTBL11
OPNRWBND     RDSTBL19     WRTSTBL17
OPNRDDIC     RDSTBL2      WRTSTBL19
OPNRDDTL     RDSTBL3      WRTSTBL2


Input/Output routines (continued)

WRTSTBL3     WRTSTBL5     WRTSTBL7
WRTSTBLDT    WRTSTBLR     WRTSYSP
WSTBLPAR     WTBND2       WTBND3
WTBNDPAR     WTDICDEF     WTDICUNTS
WTNEWDTL     WTNEWNAV     WTSDTLR2
WTSPRIR1     WTSPRIR2     WTSPRIR3
WTVLDF


Specific functions


ADDCNTBND    DBQRYSET     EXPSET
ADDCNTPRI    DQBRYSHOW    EXPSHOW
ADDCRUISE    DEFBNDFIL    EXPSTOP
ADDINTOIC    DEFDBBND     FLODTOPOS
ADDNEWUSR    DEFBDDIC     FNDOICTRM
ADDSTBL      DEFDBDTL     FMDSTBL1D
ADDTODIC     DEFDBHELP    GETDICOEF
ADDTOSTBL    DEFDBHIS     GETINTPAR
ALPHAPOS     DEFDBPRI     GETNEWBIT
ALPHASTBL    DEFDBVAL     GETNEWCHR
CHKUSREXE    DEFDTLFIL    GETNEWCPX
CHKUSRUPI    DEFNULVAL    GETNEWDBL
CODTOAFIL    DEFPRIFIL    GETNEWINT
COOTOPFIL    DEFSTBL      GETSTBLE
CPOSBNDE     DEFSYSP      GETVARID
DBACTION     DEFVARTYP    INITOBHIS
DBCMDOP      EXPATFL      INITOIC
DBMENUOPP    EXPDEFN      INITSTBL
DBQRYATEL    EXPDONE      INITVLOF
DBQRYDEFN    EXPDRAW      MAKBNDFIL
DBQRYDRAW    EXPOSPY      MAKDTLFIL
DBQRYDSPY    EXPEND       MAKPRIFIL
DBQRYHELKP   EXPHFUCP     MVJINFSTG


Specific functions (continued)

DBQRYLIST    EXPLIST      POSTDAFIL
DBQRYPLOT    EXPPLOT      POSTOFCOD
                          POSTOFPFIL
                          SETQRYDFT
                          UPDDBDIC
                          UPDBNDR2
                          UPDBNDR3
                          UPDDBSET
                          UPDSTBL
                          UPDTPNTRS
                          UPTALPHA
                          UPTSTBLA


Utility routines


ALIMSSRNK 

FNDSTRING 

OBSGXPND 

ALIMSXPND 

GANOMSRNK 

OPENERR 

BLNKFILE 

GANOMXPND 

OPNSEQR 

CDISTRSNK 

GETCMPDAT 

POSSANK 

CDISTXPND 

GETJINF 

POSXPND 

CHGNVMTIM 

GETNXTPNO 

PRSSWTCH 

CLOSEFILE 

GETUSRNAME 

PRSVERB 

CLOSERR 

HASHTEXTADDRESS 

READBTXT 

COUNTERS 

HSEARCH 

READERR 

DATESRNK 

IDTOMULT 

RECNOTFND 

DATEXPND 

INPTTXT 

SRNKR4I2 

DATBITSAM 

LISTFILE 

SRNKR4I4 

DELAY 

LREADERR 

SUBQUAD 

DEPTHSRNK 

LSTNONBLK 

TERMDESC 

DEPTHXPND 

LSTSTRING 

TESTLL 

DREADERR 

LWRITERR 

TMAGSRNK 

DRNLCSRNK 

MAXELL 

TMAGXPND 

DRNLCXPND 

MANOMSRNK 

TZONESANK 

DWRITERR 

MANOMXPND 

TZONEXPND 

EOFFOUND 

MTNUMSRNK 

UNXEDEOIF 

ERRMESS 

MTNUMXPND 

UPDOATCNT 

EOTVOSSRNK 

NEXTPRIME 

UPLSTDBR 

EOTVOSXPND 

NOROOM 

VELSRNK 

Utility  routines  (continued) 


FNDDBREC 

NXTBLK 

VELXPND 

FNDLSTREC 

NXTNONBLD 

WRITERR 

FNDNXTFRE 

OBSGSRNK 

WRTDBTXT 

XPNDI2I4 

XPNDI4R4 

XTRCTOBJ 

ZCOORSRNK

ZCOORXPND


DICTIONARY STRUCTURE


Revision  Date:  11  August  1981 

The  dictionary  is  maintained  in  two  distinct  parts  of  a  direct  access 
file.  In  the  first  part  entries  are  inserted  and  retrieved  via  a 
straightforward  hashing  algorithm.  Primary  area  overflow  technique  is  used 
to  resolve  collisions.  Each  48  byte  record  (except  record  1)  in  this  section 
is  organized  as  follows: 


Contents                                          Type / length

Unique term id number                             integer*2
Word entry                                        34 bytes
Local pointer to next record in
  alphabetical sequence                           integer*4
Local pointer to first record of entry
  attributes (details pointer)                    integer*4
Flags (bit 1 for synonym entry)                   32 bits

Record  1  in  the  dictionary  maintains  the  following  information: 


Contents                                          Type / length

First pointer for alphabetically sequenced
  linked list                                     integer*4
Modulo value for hashing scheme (equal to
  size of first part of dictionary)               integer*4
Local pointer into details section for
  system specifications and other info            integer*4
Next number available as data group numbers       integer*2
Next free record in second section of the
  dictionary                                      integer*4
Next term id # available for use for word
  entry                                           integer*2

The  direct  access  part  of  the  dictionary  containing  the  dictionary  details 
contains  the  following  kinds  of  records: 

Record  Type  1  -  Library  System  parameters  (subtypes  1,  2  and  3) 


All three subtypes share the same layout:

Contents                                          Type / length

Record type (= 1)                                 integer*2
Next local pointer                                integer*4
Record type of next record in list                integer*2
Record type subcode (= 1, 2 or 3)                 integer*2
Network name                                      14 bytes
Disk name                                         8 bytes
Directory name                                    14 bytes
Unused                                            2 bytes

Subcode 1: network, disk and directory names for data.
Subcode 2: network, disk and directory names for bounds tables.
Subcode 3: network, disk and directory names for ship table.

Record  type  2  -  Parameter  location  in  data  base 


Contents                                          Type / length

Record type (= 2)                                 integer*2
Next local pointer for entry                      integer*4
Record type of next record in list                integer*2
Owner pointer into first part of dictionary       integer*4
Data base file type containing this data
  parameter (1 = primary, 2 = secondary)          integer*1
Data base group record number containing
  this data parameter                             integer*2
Starting byte number in data base                 integer*1
Length in bytes                                   integer*1
Variable type code as stored in data base         integer*2
Bit position starting from right                  integer*1
Original variable type code before conversion     integer*2
Date added                                        5 bytes

Record  type  3  -  English  definition  (units  specification  follows  definition) 


Contents                                          Type / length

Record type (= 3)                                 integer*2
Next local pointer                                integer*4
Record type of next record in list                integer*2
Subcode                                           integer*2
Definition of entry (continued as needed)         38 characters

Record type 5 - Units specification (follows definition; last entry)

Contents                                          Type / length

Record type (= 5)                                 integer*2
Next local pointer                                integer*4
Record type of next record in list                integer*2
Subcode                                           integer*2
Units of entry in text form
  (continued if needed)                           38 characters

Record type 7 - Pointer variables

Contents                                          Type / length

Record type (= 7)                                 integer*2
Next local pointer                                integer*4
Record type of next record in list                integer*2
Subcode                                           integer*2
Reserved for pointer type variable                38 bytes

Record type 11 - Relationship/set or table name variable

Contents                                          Type / length

Record type (= 11)                                integer*2
Next local pointer                                integer*4
Record type of next record in list                integer*2
Subcode                                           integer*2
Reserved for relationship/set/table
  name/type variable                              38 bytes

Record types between 23 and 43 specify an offset and range (low and
high values) for each data type as appropriate.


Record  type  23  -  Integer  *1,  *2,  or  *4  type  variable 


Contents                                          Type / length

Record type (= 23)                                integer*2
Next local pointer                                integer*4
Record type of next record in list                integer*2
Mult. constant                                    integer*1
Offset value                                      integer*4
Lowest valid value (used as invalid value)        integer*4
Highest valid value                               integer*4
Unused                                            27 bytes

Record type 29 - Real *4 type variables

Contents                                          Type / length

Record type (= 29)                                integer*2
Next local pointer                                integer*4
Record type of next record in list                integer*2
Mult. constant                                    integer*1
Offset value                                      Real*4
Lowest valid value (used as invalid value)        Real*4
Highest valid value                               Real*4
Unused                                            27 bytes

Record type 31 - Character type variable

Contents                                          Type / length

Record type (= 31)                                integer*2
Next local pointer                                integer*4
Record type of next record in list                integer*2
Remainder of record                               Not specified yet

Record type 37 - Bit data type

Contents                                          Type / length

Record type (= 37)                                integer*2
Next local pointer                                integer*4
Record type of next record in list                integer*2
Remainder of record                               Not specified yet

Record type 41 - Complex data type

Contents                                          Type / length

Record type (= 41)                                integer*2
Next local pointer                                integer*4
Record type of next record in list                integer*2
Remainder of record                               Not specified yet

Record type 43 - Double precision data type

Contents                                          Type / length

Record type (= 43)                                integer*2
Next local pointer                                integer*4
Record type of next record in list                integer*2
Remainder of record                               Not specified yet

In  all  cases,  a  value  of  0  for  a  next  pointer  indicates  that  the  linked 
list  has  terminated. 

The  name  for  the  dictionary  file  is 

DBLD.DIC 

from  Data  Base  for  Location-Dependent  data. 


DATA  STRUCTURE 


Revision  Date:  23  July  1981 

The  data  are  organized  into  three  categories:  primary  data, 
secondary  data,  and  pointers  and  file  specification  data. 


Primary  Data 

The primary data include the date, time, and position information
(see figure 1). These data are organized in direct access files according
to the particular one degree geographical square in which the data
reside. This approach is taken in a data base optimized for location
dependent data as it provides:

1.  growth  potential  for  other  than  geophysical  data  types, 

2.  retaining  the  ability  to  reduce  the  on-line  size  of  the 
data  base  while  affecting  as  few  users  as  possible, 

3.  allowing  the  data  base  to  store  data  in  an  economical 
manner  -  without  resorting  to  extraordinary  pointer 
(find  next  record)  schemes, 

4.  simplicity. 

One implementation of this scheme could require as many as 64800
files (180 x 360 one degree squares) to be supported if data were stored
in all degree squares of the world. However, specific applications usually
leave a significant fraction of the world's surface unsampled. In
particular, the geophysical data base implemented here would need fewer
than 50,000 position files even when all data are included in the data
base. This is still too many files for the operating system to maintain,
so an alternate approach is taken. Only four files are used to cover the
world, one file for each quadrant of the earth (that is, north-east,
north-west, south-east and south-west). Within each quadrant file, data
can still be accessed by degree square because of the bounds table (see
next section). This scheme retains the desired retrieval capabilities
without undue file manipulation.

The file naming convention for primary data uses the format
(NORTH/SOUTH)(EAST/WEST).DB1
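
As a rough illustration of the quadrant scheme, the sketch below picks the
quadrant file for a position and checks the degree-square count quoted
above. It is written in Python for illustration only. The helper name
quadrant_file_name and the literal spelling of the result (for example
NORTHWEST.DB1) are assumptions; the treatment of zero as positive and of
180 degrees as west follows the conventions stated for the bounds table.

```python
def quadrant_file_name(lat, lon, extension="DB1"):
    """Return the quadrant file assumed to hold a position (hypothetical helper).

    lat and lon are signed decimal degrees: north and east are positive.
    """
    # Data on the International Date Line are taken to be at 180 degrees west.
    if lon == 180.0:
        lon = -180.0
    # A latitude or longitude of exactly zero is treated as positive.
    ns = "NORTH" if lat >= 0 else "SOUTH"
    ew = "EAST" if lon >= 0 else "WEST"
    return f"{ns}{ew}.{extension}"


# One file per one degree square would need 180 x 360 files.
DEGREE_SQUARES = 180 * 360

print(quadrant_file_name(41.5, -70.7))   # a position near Woods Hole
print(DEGREE_SQUARES)
```

Passing extension="DB2" gives the matching secondary data file name under
the same assumptions.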


Each primary record is associated with (that is, points to) at least
one pointer-type record, also contained in the position file. A pointer
record contains a "next record" pointer for each relationship in which the
position record participates. The pointer record contains a record number
and a code for the file name. Since the location of the stored data
depends on its geographical position it seems logical that this code depend
on the latitude and longitude values.

In order to speed up searching and responses to queries on availability
of data within specific degree squares, the number of points within a degree
square for each cruise leg is maintained in the bounds table (see the
following section). A pointer to the last record in the degree square for
each cruise leg (included once whenever a cruise leg enters or reenters a
degree square) permits faster navigating through the data base.

This approach offers a good deal of growth potential since a position
record can participate in an essentially unlimited number of relationships
without compromising the pointer structure. Also, it offers the opportunity
to restructure the ordering of the pointer-type records (where there is more
than one) to improve retrieval.

Since a sequential write operation is less costly than the direct
access write, we provide a means to add new data to these position files
using the sequential write operation. This method also minimizes the file
sizes by allowing files to be only as long as they need be, without having
to preallocate file space. The first record in each position file holds
the information needed to maintain the linked list structure of the file
and supports this scheme. The format of the first record is as follows.

Contents                                          Type / length

Record type (= 0)                                 integer*2
First free record (equals record number of
  the last record in the file plus one)           integer*4
Creation date (year, day and hour, minute)        5 bytes
Date last updated (year, day and hour, minute)    5 bytes

Secondary  Data 

The pointer record includes a pointer into a secondary data file. That
is, any parameters (the secondary data) associated with the position are
stored in a separate file. There are a number of advantages to this approach.

1.  Adding a new data type does not affect the organization
of the primary data and allows for a great deal of growth
potential.

2.  Secondary data can be removed entirely from the disk
without affecting queries based on position.

3.  Selected portions of secondary data can easily migrate on
and off the disk as the needs for these data change.

4.  The cell lengths in the primary and secondary files need
not be the same, making better use of disk space.

The file naming convention for the secondary data is
(NORTH/SOUTH)(EAST/WEST).DB2

At the present time one attribute file name at most will exist for each
positional file name. However, one direction of growth is to divide
different attributes into different attribute file names.

The attribute (or secondary) file is maintained as a multi-list
structure, with linked lists of attributes, one linked list for each parent
(owner) position entry in the positional data file. Each linked list is
made up of storage cells of disk space containing information for one group
of data attributes and a next pointer. The purpose of the group concept is
to improve retrieval times for data which are usually used together.

Figure 2 contains the groupings for known data parameters. Note that the
pointer record is located in the primary file rather than in the secondary
file. This is done so that the secondary data files can be totally removed
without affecting the ability of the query software to follow
set/relationship next pointers throughout the data base.

The cell size for the attributes file should be at least 22 bytes
long for the groups as presently defined. We chose a record size of 24
bytes.
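
The multi-list structure can be pictured with a short Python sketch,
given here only as an illustration. It follows one linked list of
attribute cells from a starting record until a next pointer of 0 ends the
list, as stated for the dictionary. The in-memory dictionary of cells, the
field names and the function name walk_cells are assumptions standing in
for direct access reads of the secondary file.

```python
def walk_cells(cells, first):
    """Yield (record_number, cell) along one linked list of attribute cells.

    cells maps a record number to a dict holding at least a "next" pointer;
    a next pointer of 0 marks the end of the list.
    """
    seen = set()
    record = first
    while record != 0:
        if record in seen:
            # Guard against a damaged file whose pointers form a loop.
            raise ValueError(f"pointer loop at record {record}")
        seen.add(record)
        cell = cells[record]
        yield record, cell
        record = cell["next"]


# Hypothetical cells: a navigation group followed by a magnetics group.
cells = {
    5: {"type": 3, "next": 9, "group": "navigation"},
    9: {"type": 7, "next": 0, "group": "magnetics"},
}
groups = [cell["group"] for _, cell in walk_cells(cells, 5)]
print(groups)
```

Grouping attributes that are usually requested together keeps such a walk
short for the common queries.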


Data  Available  Mask 

The data available mask (defined in the primary record) contains
information about which dictionary terms have data values stored with this
position. The bit number within the mask corresponds to the term id number
assigned to the term when it is added to the dictionary. Bit number 0
(zero) is reserved for future use. Bit number 1 must be used to define the
POSITION(S) term. Unfortunately, this scheme is not as implementation
independent as the prime number scheme originally designed, but it does
allow faster access and more immediate growth potential.
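
A minimal Python sketch of how such a mask might be tested and updated is
given below for illustration. It assumes that bit number n is the bit of
weight 2**n (counting from the right) and that the mask is the 8 byte (64
bit) field listed in figure 1; the function names and the term id 7 are
hypothetical.

```python
MASK_BITS = 64          # the data available mask occupies 8 bytes
POSITION_TERM_ID = 1    # bit 1 is reserved for the POSITION(S) term


def set_term(mask, term_id):
    """Return the mask with the bit for term_id switched on."""
    if not 1 <= term_id < MASK_BITS:
        # Bit 0 is reserved, and ids beyond the mask width cannot be stored.
        raise ValueError(f"term id {term_id} cannot be stored in the mask")
    return mask | (1 << term_id)


def has_term(mask, term_id):
    """Report whether data for term_id are stored with this position."""
    return (mask >> term_id) & 1 == 1


mask = set_term(0, POSITION_TERM_ID)
mask = set_term(mask, 7)            # 7 is a made-up term id
print(has_term(mask, 7), has_term(mask, 8))
```

A single shift and mask answers the availability question without reading
the secondary file, which is the faster access the text refers to.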


Figure  1 


Primary  File  Contents 


Position/Time Record

Contents                              Length      Starting
                                      in bytes    byte

Record type (=1)                         2            1
Next local pointer                       4            3
Reserved_A                               2            7
Year                                     1            9
Day and hour                             2           10
Minute                                   2           12
Time zone                                2           14
Latitude                                 4           16
Longitude                                4           20
Z-coordinate or height (reserved)        4           24
Cumulative distance                      4           28
Data available mask                      8           32
Ship table owner pointer                 4           40
Pointer to secondary data                4           44


Pointer Record (subcode = 1)

Contents                              Length      Starting
                                      in bytes    byte

Record type (=2)                         2            1
Next local pointer                       4            3
Reserved_A                               2            7
Record subtype (=1)                      2            9
Reserved_B                               4           11
Next pointer in degree square            4           15
Reserved_C                               2           19
Next file name code, cruise leg          4           21
Next pointer for cruise leg              4           25
Reserved_D                               2           29
Next file name code, abbreviated
  navigation stream                      4           31
Next pointer for abbreviated
  navigation stream                      4           35
Reserved_E                               2           39
Reserved_F                               8           41
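
The starting bytes in figure 1 follow from the field lengths: each field
starts one byte past the end of the previous one, counting from byte 1.
The Python sketch below, offered only as a cross-check, recomputes the
starting bytes of the position/time record from the lengths in the figure.
The list name and the helper starting_bytes are illustrative.

```python
# (contents, length in bytes) for the position/time record of figure 1
POSITION_TIME_FIELDS = [
    ("Record type", 2),
    ("Next local pointer", 4),
    ("Reserved_A", 2),
    ("Year", 1),
    ("Day and hour", 2),
    ("Minute", 2),
    ("Time zone", 2),
    ("Latitude", 4),
    ("Longitude", 4),
    ("Z-coordinate or height", 4),
    ("Cumulative distance", 4),
    ("Data available mask", 8),
    ("Ship table owner pointer", 4),
    ("Pointer to secondary data", 4),
]


def starting_bytes(fields):
    """Return ([(name, length, start)], bytes_used) with 1-based starts."""
    layout = []
    start = 1
    for name, length in fields:
        layout.append((name, length, start))
        start += length
    return layout, start - 1


layout, used = starting_bytes(POSITION_TIME_FIELDS)
for name, length, start in layout:
    print(f"{name:<28}{length:>3}{start:>5}")
print("bytes used:", used)
```

The same helper can be applied to the other record layouts to confirm
that each fits within its fixed record size.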


Counter record (subcode = n)

Contents                              Length      Starting
                                      in bytes    byte

Record type (=3)                         2            1
Next local pointer                       4            3
Reserved_A                               2            7
Record subcode = n, defined by
  variable id # from the dictionary      2            9
Number of points in degree square
  in this cruise leg                     4           11
Backwards pointer to ship table
  record type 2                          4           15
Number of times queried                  4           19
Date last queried                        5           23
Date last updated                        5           28
File code for first position
  pointer                                4           33
Record number for first position         4           37
File code for last position
  pointer                                4           41
Record number for last position          4           45


Figure 2


Secondary File Contents

Navigation detail

Contents                              Length      Starting
                                      in bytes    byte

record type (= 3)                        2            1
next local pointer                       4            3
reserved                                 2            7
current speed east-west                  2            9
current speed north-south                2           11
navigation quality code                  1           13
navigation type code                     1           14
platform (ship) speed east-west          2           15
platform (ship) speed north-south        2           17


Bathymetry detail

Contents                              Length      Starting
                                      in bytes    byte

record type (= 3)                        2            1
next local pointer                       4            3
reserved                                 2            7
depth correction code                    1            9
depth quality code                       1           10
depth type code                          1           11
uncorrected depth (travel time)          4           12


Magnetics detail

Contents                              Length      Starting
                                      in bytes    byte

record type (= 7)                        2            1
next local pointer                       4            3
reserved                                 2            7
diurnal correction                       2            9
magnetic quality code                    1           11
magnetic sensor altitude                 2           12
sensor used code                         1           14
geomagnetic total field, 1st sensor      4           15
geomagnetic total field, 2nd sensor      4           19


Gravity detail

Contents                              Length      Starting
                                      in bytes    byte

record type (= 11)                       2            1
next local pointer                       4            3
reserved                                 2            7
Bouguer anomaly                          2            9
Eotvos correction                        2           11
gravity quality code                     1           13
observed gravity                         4           14


Seismic detail

Contents                              Length      Starting
                                      in bytes    byte

record type (= 13)                       2            1
next local pointer                       4            3
reserved_A                               2            7
seismic shot identification             12            9
reserved_B                               4           21


Geophysical detail

Contents                              Length      Starting
                                      in bytes    byte

record type (= 2)                        2            1
next local pointer                       4            3
reserved_A                               2            7
corrected depth                          4            9
geomagnetic anomaly                      2           13
free air anomaly                         2           15
reserved_B                               8           17


SET/RELATIONSHIP/TABLE STRUCTURE


Revision  Date:  11  August  1981 


Bounds Table

One of the basic features of any data base management system is that
an entity may participate in one or more relationships (or sets) with other
entities. In the data base optimized for location-dependent data it
follows that geographical proximity is one such relationship. The
organization chosen to implement this relationship is a two level
structure, called the bounds table.

The first level in the bounds table data structure consists of 648
file names corresponding to the 648 different 10 degree by 10 degree
geographical shapes on the world's surface. (We should not call these
shapes squares because of the spherical shape of the earth.) Each file
name is defined by the particular 10 degree by 10 degree area it refers to,
as


BND(N/S)p(E/W)qq.TBL

where N means north, S means south, E means east, and W means west; p can
take the values 0 through 9; and qq can take the values between 00 and 18.
Leading zeros must be present. This file naming convention makes it easier
to identify which 10 degree area a table refers to.

In  order  to  save  disk  storage,  if  no  data  exist  in  a  10  by  10  degree 
area  then  the  corresponding  bounds  table  as  defined  above  will  not  exist. 

The  second  level  in  the  bounds  table  data  structure  contains  the 
contents  of  these  files.  Each  file  contains  an  entry  for  each  one  degree 
shape  ("square")  within  the  10  degree  shape.  The  formula  for  placement  or 
retrieval  of  a  specific  1  degree  entry  is 

linear  position  =  ABS(latitude  units)  x  10 
+  ABS(longitude  units)  +  2 

This  formula  causes  a  progression  from  left  to  right  (west  to  east)  and 
then  bottom  to  top  (south  to  north)  for  positive  latitudes  and  longitudes. 
There  will  always  be  at  least  101  entries  in  the  file  although  many  entries 
may  indicate  that  no  data  exists  in  that  one  degree  shape. 

As an example, consider the 10 degree shape whose upper right corner
is at 10 degrees south and 150 degrees west. The corresponding bounds
table is called


BNDS1W15.TBL

The data located within the boundaries equal to or less than 10 degrees south
but greater than 20 degrees south, and equal to or less than 150 degrees west
but greater than 160 degrees west would be accessible through this file. In
this case the table name refers to the upper right corner because we have a
negative latitude and longitude.
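
The file naming rule and the linear position formula can be combined in a
short Python sketch, shown here as an illustration rather than as the
system's own routine. It reads "latitude units" and "longitude units" as
the units digit of the whole degrees, which is an interpretation of the
formula, and it truncates the absolute values so that boundary positions
fall where the example above places them. The function name
bounds_table_entry is hypothetical.

```python
import math


def bounds_table_entry(lat, lon):
    """Return (bounds table file name, record number) for a position.

    lat and lon are signed decimal degrees: north and east are positive.
    """
    # Data on the International Date Line are taken to be at 180 degrees west.
    if lon == 180.0:
        lon = -180.0
    # A latitude or longitude of exactly zero is treated as positive.
    ns = "N" if lat >= 0 else "S"
    ew = "E" if lon >= 0 else "W"
    lat_deg = math.floor(abs(lat))      # whole degrees of latitude
    lon_deg = math.floor(abs(lon))      # whole degrees of longitude
    name = f"BND{ns}{lat_deg // 10}{ew}{lon_deg // 10:02d}.TBL"
    # linear position = ABS(latitude units) x 10 + ABS(longitude units) + 2
    record = (lat_deg % 10) * 10 + (lon_deg % 10) + 2
    return name, record


print(bounds_table_entry(-15.3, -152.7))
```

Under these assumptions a point at 15.3 degrees south, 152.7 degrees west
falls in BNDS1W15.TBL, and the record numbers always lie in the range 2
through 101, with record 1 left free for the file header.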

The contents of an entry in the bounds table include the number of points
within the one degree shape. A local pointer is present which points to
additional information about this degree square. The additional information
is maintained in a linked list structure. The records in this structure
maintain counts and pointers for each cruise leg as it enters and leaves the
degree shape. Records for each data type indicate the total number of points
available in the degree shape for each data type. Records containing
additional information are kept in this same file, after record number 101.
The formats for the various 48 byte record types follow:

Record Number 1:

Contents                                          Type / length

Record type (= 0)                                 integer*2
Next free record                                  integer*4
Creation date                                     5 bytes
Date last updated                                 5 bytes

Record type 2: for records 2 through 101, count of the total number
of navigational records in the degree square

Contents                              Length      Starting
                                      in bytes    byte

Record type (=2)                         2            1
Next local pointer                       4            3
Next record type                         2            7
Record subcode = 1                       2            9
Total number of navigation points
  in the degree shape                    4           11
Reserved_A                               4           15
Number of times queried                  4           19
Date created                             5           23
Date last updated                        5           28
Reserved_B                               4           33
Reserved_C                               4           37
Reserved_D                               4           41
Reserved_E                               4           45


Record numbers greater than 101 contain information about each cruise leg
as the ship's track enters and leaves a degree square (record type 3) as well
as a totals count for each data type available in the degree square (record
type 2, with subcode greater than 1).

Record type 2: totals count for the data type code specified by the subcode

Contents                              Length      Starting
                                      in bytes    byte

Record type (=2)                         2            1
Next local pointer                       4            3
Next record type                         2            7
Record subcode = n                       2            9
Total number of data points
  in the degree shape                    4           11
Reserved_A                               4           15
Number of times queried                  4           19
Date created                             5           23
Date last updated                        5           28
Maximum value (*)                        4           33
Minimum value (*)                        4           37
Reserved_D                               4           41
Reserved_E                               4           45

Record type 3: pointer, counter and statistics for cruise leg navigation
records each time a cruise enters this degree square.

Contents                              Length      Starting
                                      in bytes    byte

Record type (=3)                         2            1
Next local pointer                       4            3
Next record type                         2            7
Record subcode = 1                       2            9
Number of points in degree square
  in this cruise leg segment             4           11
Backwards pointer to ship table
  record type 2                          4           15
Number of times queried                  4           19
Date last queried                        5           23
Date last updated                        5           28
File code for first position
  pointer                                4           33
Record number for first position         4           37
File code for last position
  pointer                                4           41
Record number for last position          4           45


(*) Values are stored as real*4, in the original units. Complex values
have only their real values stored. Character string variables cause a value
of 0 (zero) to be stored.

This two level scheme allows for the fact that there will be some 10
degree shapes containing no data. Therefore, fewer than the full number of
648 file names will be needed. With this scheme there is little overhead
due to sparse data sets.

Any data residing on the International Date Line is taken to be at
180 degrees west rather than 180 degrees east. A latitude or longitude of
exactly zero is treated as a positive number.


Ship Table


Another relationship that geophysical measurements participate in is
the commonality of collection platform. Typically, this platform is an
ocean-going ship. While different organizations have chosen unique ways to
identify data collected by one ship during one collection period, a common
approach is to call the collection period from one port stop to another a
leg. It is also common to group these legs together into a cruise, where a
cruise begins and ends at the home port. Each leg is numbered sequentially
with the first leg, leg 1, departing from the home port.

Since it is very common for researchers to request data from one or
more specific cruise legs, the data base scheme uses a ship table (similar
to the bounds table) to facilitate such request retrievals. Besides
holding a pointer (file name and record number) to the first position
record, the ship table holds important information about the cruise
including the unique cruise name, start and end dates, chief scientist name
during the leg, number of data points collected, project name, type of
instrumentation used, types of data collected, etc.

The bounds table (described above) and the ship table could be
intertwined by identifying whenever a cruise leg enters and leaves a degree
shape. In such a scheme the ship table would contain a pointer to a linked
list of pointers. These pointers would point to the first position inside
a new degree shape, following the cruise track. The bounds table would
contain a pointer to a similar linked list but in this case all the
positions referred to would lie in the same degree shape. Advantages of
this scheme include: a simple way of identifying when data within a
particular degree shape belong to a particular cruise leg; and an easy
method to respond to queries about which cruises and how many data points
are available in specific areas. A disadvantage is that simplicity is
reduced by forever tying together two independent relationships. Adding
new relationships could overly complicate the data base structure. Because
of these disadvantages this scheme is not used.

Insertion and retrieval from the ship table is via a simple hashing
algorithm similar to the one used in the dictionary structure. Based on
current data holdings and estimated insertion rates, the ship table need be
only 307 entries long, resulting in a hash table less than 60 percent
full. However, like the dictionary file, the ship table has a primary and
secondary part. The contents of the primary part are identified as record
type 1. The secondary part contains further information about the cruise.
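
The report does not spell out the hashing algorithm, so the Python sketch
below is only one plausible reading. It assumes a hash equal to the sum
of the character codes modulo the table size, and it takes "primary area
overflow" to mean that a colliding entry is placed in the next free slot of
the primary area (linear probing). The function names and the sample
names are hypothetical; only the table size of 307 comes from the text.

```python
TABLE_SIZE = 307    # primary part of the ship table; also the modulo value


def hash_name(name, modulo=TABLE_SIZE):
    """Assumed hash: sum of character codes modulo the table size."""
    return sum(ord(ch) for ch in name) % modulo


def insert(table, name):
    """Store name in its home slot or in the next free slot after it."""
    size = len(table)
    slot = hash_name(name, size)
    for _ in range(size):
        if table[slot] is None or table[slot] == name:
            table[slot] = name
            return slot
        slot = (slot + 1) % size        # overflow within the primary area
    raise RuntimeError("primary area is full")


def lookup(table, name):
    """Return the slot holding name, or None if it is not in the table."""
    size = len(table)
    slot = hash_name(name, size)
    for _ in range(size):
        if table[slot] is None:
            return None                 # an empty slot ends the probe
        if table[slot] == name:
            return slot
        slot = (slot + 1) % size
    return None


table = [None] * TABLE_SIZE
first = insert(table, "AB")
second = insert(table, "BA")            # same character sum, so it collides
print(first, second, lookup(table, "BA"), lookup(table, "ZZ"))
```

Keeping the table well under full, as the 60 percent figure suggests,
keeps such probe sequences short.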
The formats for the various 48 byte record types in the ship table file,
called SHIP.TBL, follow.

Record number 1:

Contents                                          Type / length

Reserved                                          2 bytes
First pointer to alphabetical list                integer*4
Modulo value for hashing                          integer*4
Local pointer for details                         integer*4
Reserved                                          2 bytes
Next free record in second section                integer*4
Record  type  1:  Ship  name 


Contents                                          Type / length

Reserved                                          2 bytes
Ship identification                               34 bytes
Next pointer in alphabetically sorted
  linked list                                     integer*4
Details pointer                                   integer*4
Flags (bit 1 for synonym)                         2 bytes (16 bits)
Security code                                     integer*2

Note that the pointer to the first data point in the cruise does not appear
here. It will appear in a linked list structure defined in record type 2 of
the ship table.

Record  type  2  -  cruise  start  and  end  dates,  date  ship  table  entry  was  made, 
and  backwards  pointer  to  ship  id. 


Contents                                          Type / length

Record type (= 2)                                 integer*2
Next record in linked list                        integer*4
Record type of next record in list                integer*2
Date the entry is made in the table               5 bytes
Start date of cruise leg
  (year, day and hour, minute)                    5 bytes
End date of cruise leg
  (year, day and hour, minute)                    5 bytes
Backwards pointer to ship name                    integer*4


The backwards pointer is made available so that the first part of the
ship table can change in size without having to change each position. (Each
position record contains a backwards pointer to its corresponding ship
identification.) But in order for this to be useful, the ship table detail
section must grow from the end of the preallocated file backward towards
record one.

Record  type  3  -  pointer,  counter  and  statistics  for  data  parameters, 
identified  by  the  subcode  value  n 


Record  type  (=3)  2  1 

Next  local  pointer  4  3 

Next  record  type  2  7 

Record  subcode  =  n,  2  9 

Number  of  points  of  this  data  4  11 

type  in  cruise  leg 

Backwards  pointer  to  ship  table  4  15 

record  type  2 

Number  of  times  queried  4  19 

Date  last  queried  5  23 

Date  last  updated  5  28

File  code  for  first  position  4  33 

pointer 

Record  number  for  first  position  4  37 

File  code  for  last  position  4  41 

pointer 

Record  number  for  last  position  4  45 


As soon as any data are added to the data base, there must exist a record
type 3 for the position information. The record subtype (or subcode) for the
position information is 1. There exists a record type 3 for each data type
available in the cruise leg. The subcodes for these data types are the same
as their unique dictionary identifying numbers. These records are usually
placed immediately after the record type 2.

Figure 3 summarizes the secondary ship table types 5 through 19. All
have the same format as the example given in the figure.

Figure 3


Secondary Ship Table Parameters

Description           Record type     Subcode

Port stops                 5             0
Chief scientist            7             0
Contributor               11             0
Project name              13             0
Comments                  17             0
MGD77 header              19             1 thru 24

Example (port stops):

Contents                                          Type / length

Record type (= 5)                                 integer*2
Next local pointer                                integer*4
Record type of next record in list                integer*2
Subcode (= 0)                                     integer*2
Port stops (continued if needed)                  38 characters

Record type 23 is used to store the overall latitude and longitude bounds
for the cruise leg. The values are stored in compact format. The contents of
record type 23 follow:

Record type 23 - overall latitude and longitude bounds for the cruise leg

Contents                  Length (bytes)   Starting byte

Record type (=23)               2                1
Next local pointer              4                3
Next record type                2                7
Reserved A                      2                9
Maximum latitude                4               17
Minimum latitude                4               21
Maximum longitude               4               25
Minimum longitude               4               29
Reserved B                      4               33
Reserved C                      4               37
Reserved D                      4               41
Reserved D                      4               45

USER  DEFAULT  FILE 


6  June  1979 


The  user  default  file  contains  attributes  about  each  user  of  the  DBLD. 
These  attributes  are  used  to  define  the  default  options  and  conditions  for 
various  segments  of  the  retrieval  software.  The  default  file  name  is 
contained  in  the  validation  file. 


UNIT REFERENCE NUMBER ASSIGNMENTS


Revision  Date:  6  April  1980 


A number of unit reference numbers are permanently reserved. These are:

validation file              20
dictionary file              21
history/accounting file      22
help file                    23
ship table file              24
bounds table file            25
reserved (for the tables)    26 through 39

Unit reference numbers from 40 through 49, and 50 through 59 are
available for assigning to various data files, corresponding to one degree
squares of data. The numbers in the forties are to be used for primary data
files; the numbers in the fifties are to be used for the secondary data
files. Unit reference numbers 10 through 19 are used in short term
situations, usually within one software module.
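The assignments above amount to a small lookup table. The Python sketch below is illustrative only; the function name and return strings are hypothetical, and the bounds table unit is taken to be 25, the one number left unassigned between the ship table file (24) and the reserved block that starts at 26.

```python
# Permanently reserved unit reference numbers.
RESERVED_UNITS = {
    20: "validation file",
    21: "dictionary file",
    22: "history/accounting file",
    23: "help file",
    24: "ship table file",
    25: "bounds table file",  # assumption: see the note above
}


def describe_unit(unit: int) -> str:
    """Return the role of a unit reference number under the scheme above."""
    if unit in RESERVED_UNITS:
        return RESERVED_UNITS[unit]
    if 10 <= unit <= 19:
        return "short term use"
    if 26 <= unit <= 39:
        return "reserved (for the tables)"
    if 40 <= unit <= 49:
        return "primary data file"
    if 50 <= unit <= 59:
        return "secondary data file"
    return "unassigned"
```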
HISTORY AND ACCOUNTING STATISTICS


Revision  Date:  14  April  1980 


Each time a data base user begins a DBLD procedure an initial entry is
made in the history/accounting file. Additional entries (see Appendix 3) are
made at various times indicating:

data  base  modules  used 
which  data  are  accessed 
number  of  points  read 


At the end of the program a final entry is made in the history/accounting
file. Using the information stored from the first and last entries, it is
possible to compute the number of buffered input/output operations, CPU time,
direct input/output operations, elapsed time, and page faults.
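The computation is a field-by-field subtraction of the first entry from the last. A minimal Python sketch follows, assuming each entry carries cumulative process totals; the class and field names are hypothetical and do not reflect the actual layout of the history/accounting file.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """Cumulative process totals recorded in one history entry (assumed)."""
    elapsed_sec: float
    cpu_sec: float
    direct_io: int
    buffered_io: int
    page_faults: int


def usage_between(first: Snapshot, last: Snapshot) -> Snapshot:
    """Resources consumed between the first and last history entries."""
    return Snapshot(
        elapsed_sec=last.elapsed_sec - first.elapsed_sec,
        cpu_sec=last.cpu_sec - first.cpu_sec,
        direct_io=last.direct_io - first.direct_io,
        buffered_io=last.buffered_io - first.buffered_io,
        page_faults=last.page_faults - first.page_faults,
    )
```

Because the first entry is written only after the program has started, the result leaves out whatever was consumed before that point.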

These entries do not reflect the 'effort' required to bring the program
to the point where it writes the first entry into the history/accounting
file. For example, in the case of program UPDBDIC (UPdate the Data Base
DICtionary), in a test run made 19 October 1979, the following differences
were noted:

                        Computed     TIMRB/TIMRE

elapsed time (sec.)      547.63        555.94
cpu time (sec.)            2.13          2.31
direct I/O                57            59
buffered I/O             281           294
page faults              192           256

The above figures indicate that the history/accounting file is adequate for
our purposes.

These statistics can be used to compute a charge for the services
performed by the DBLD software as well as to ascertain use patterns. They can
be further studied to help reorganize and modify the data stored in the data
base.
RETRIEVAL COMMAND FORMAT


Revision  Date:  26  October  1979 


The  general  format  for  a  retrieval  command  is 


verb[/switch] [object][,...]


where 'verb' is an acceptable retrieval command action verb such as DISPLAY,
DEFINE, HELP, LIST, SHOW etc.; '/switch' is zero, one or more verb modifiers
separated by a slash; and 'object' is zero, one or more parameters separated
by a comma.

For  example,  to  obtain  the  definition  of  the  dictionary  terms  LONGITUDE 
and  YEAR  enter  the  command 


DEFINE LONGITUDE,YEAR

In this case, DEFINE is the verb and LONGITUDE and YEAR are objects. There
are no modifiers. Since unique abbreviations are permitted for verbs and
switches, this command could be written as

DEF LONGITUDE,YEAR

In  general,  objects  cannot  be  abbreviated. 

Consult  the  help  files  for  a  current  list  of  commands  and  options. 
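The command syntax and the unique-abbreviation rule can be illustrated with a short parser. This Python sketch is not the DBQUERY implementation: the verb list contains only the five verbs named above, switches are assumed to be attached directly to the verb with no intervening space, and switch names are returned as typed because no list of valid switches is given here.

```python
# Only the verbs named in this section; the real list is in the help files.
VERBS = ("DISPLAY", "DEFINE", "HELP", "LIST", "SHOW")


def resolve(abbrev: str, candidates) -> str:
    """Expand an abbreviation, requiring that it match exactly one candidate."""
    word = abbrev.upper()
    if word in candidates:
        return word
    matches = [c for c in candidates if c.startswith(word)]
    if len(matches) != 1:
        raise ValueError(f"unknown or ambiguous abbreviation: {abbrev!r}")
    return matches[0]


def parse_command(line: str):
    """Split 'verb[/switch] [object][,...]' into (verb, switches, objects)."""
    head, _, rest = line.strip().partition(" ")
    verb_part, *switches = head.split("/")
    verb = resolve(verb_part, VERBS)
    # Objects cannot be abbreviated, so they are passed through unchanged.
    objects = [obj.strip() for obj in rest.split(",") if obj.strip()]
    return verb, switches, objects
```

With these assumptions, `parse_command("DEF LONGITUDE,YEAR")` yields the verb DEFINE with no switches and the two objects, while `parse_command("D LONGITUDE")` is rejected because D matches both DISPLAY and DEFINE.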
Appendix 1

Revision  Date:  4  Jan  1982 


Definition  of  Status  Codes 


Meaning

no error, operation successful
some error has occurred
two dates are not the same
file open error
direct read error
direct write error
end of file condition
record not found in a linked list
file does not exist
array index out of range in MAXMINVAL
no valid data values contained in this detail record; that is, all data
    values are the null value
write error to output file
incorrect record type
unknown or not implemented record type
more than 64 occurrences of record type 3 in ship table for current
    cruise-leg
COPY operation cancelled by user
COPY operation cancelled because no cruises specified and bounds too
    large (more than 100 10-degree squares)
error in GETNXTPNO - existing file could not be opened
option not available (PRSSWTCH)
abbreviation not unique (PRSSWTCH)
end of library query procedure


Appendix  2 


Revision  Date:  5  Oct  1981 


Counter/Statistic  Assignment  Numbers 


Meaning                                            Counter Number

data base primary records output                   1
data base detail parameters                        numbers as assigned in
                                                   the dictionary
number of navigation records input                 130
number of times primary files opened
  for append                                       131
number of times detail files opened
  for append                                       132
number of times bounds tables opened               133
number of times detail files opened
  for read only                                    134
number of times primary files opened
  for read only                                    135
number of times primary files opened
  for read and write                               136
number of times detail files opened
  for read and write                               137
number of primary files created and
  opened for sequential appending                  138
number of detail files created and
  opened for sequential appending                  139
Appendix 3

Revision  Date:  24  Aug  1981 
Formats  for  Messages  to  History  File 


Routine Name    Contents

ADDNEWUSR       'Authorized new user', followed by the user name.

DBOPNUPD        '***START STATISTICS***'

RDSTBLCTS       ship owner pointer to record type 2, start location for
                reading and subcode value equal to the data parameter
                identification number, in the format (I12,2X,I12,2X,I6)
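For readers less familiar with Fortran edit descriptors, (I12,2X,I12,2X,I6) produces a 34-character line: two right-justified 12-character integers and one 6-character integer, separated by two blanks. The Python sketch below reproduces that layout; the function and argument names are hypothetical. As in Fortran, a value too wide for its field is shown as a run of asterisks.

```python
def format_rdstblcts(owner_pointer: int, start_location: int, subcode: int) -> str:
    """Lay out three integers as Fortran format (I12,2X,I12,2X,I6) would."""
    fields = ((owner_pointer, 12), (start_location, 12), (subcode, 6))
    parts = []
    for value, width in fields:
        text = str(value)
        # Fortran fills the field with asterisks when the value does not fit.
        parts.append(text.rjust(width) if len(text) <= width else "*" * width)
    # Each 2X descriptor inserts two blanks between fields.
    return "  ".join(parts)
```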
Appendix 4


Revision  Date:  23  July  1981 


Setting  File  Access  Protection 


On the VAX-11/780 computer, access to user files is controlled by setting
the protection levels for each file. Certain files in this library need to
be accessed (read only) by all users, while other files may need to allow
writing to them by all users. The following summary describes the procedures
to follow in order to set the correct access authorization for files on the
VAX computer.

The  account  under  which  the  library  or  dictionary  updates  are  made  should 
have  as  the  default  authorization: 

system  -  read,  write,  execute,  delete 

owner  -  read,  write,  execute,  delete 

group  -  read,  execute 

world  -  read,  execute 

Using  these  protection  levels  will  allow  everyone  to  have  read  access  to  the 
data  and  library  files  and  to  have  execute  privileges  for  the  programs.  Note 
that  all  program  sources  should  be  made  secure  by  allowing  only  the  owner  to 
have  read  access  to  the  source  files. 

These default authorization parameters are appropriate for all except
certain files. The authorization for the history/accounting file must be set
to the following:

system  -  read,  write 
owner  -  read,  write,  delete 
group  -  read,  write 
world  -  read,  write 

To change the authorization parameters on the history/accounting file issue
the following VAX command:

$ SET PROTECTION=(SYSTEM:RW,OWNER:RWED,GROUP:RW,WORLD:RW) BIGA:[DBDICT]DBHISTORY.ACC
Appendix 5


Revision  Date:  6  June  1980 


Validation  File 


The  validation  file  contains  a  list  of  authorized  users  of  the  library 
system.  There  exist  three  levels  of  authorization  as  follows: 


Level #    Authority

1          all actions permitted

2          all actions permitted except authorizing new users to the
           library. Hence, this user can add new data, delete and
           modify data.

3          read only access to the library is permitted this user
           level. This is considered the proper authorization for a
           user expected to do typical data retrieval and queries.


The  format  for  an  entry  in  the  validation  file  follows: 


user name            12 characters, left justified
date authorized      16 characters
validation level     integer*2 (encoded as I4)
security level       integer*2 (encoded as I2)
user default file    40 character file specification


The  security  level  is  used  to  control  access  to  specific  cruise  legs  of 
data.  Access  is  permitted  to  any  cruise  leg  whose  security  code  (as  stored  in 
the  ship  table)  is  equal  to  or  lower  than  the  user’s  security  level  value. 

Whenever a procedure is initiated (i.e. a program is 'run') a check is
made against the validation file. The user at his or her terminal (or in
batch) must be authorized at the appropriate validation level in order to
continue the chosen task.
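The entry format and the two checks can be sketched in Python. This is an illustration under stated assumptions, not the library's code: the five fields are assumed to be stored back to back as text in a 74-character record (12 + 16 + 4 + 2 + 40, reading the two integer encodings as Fortran I4 and I2 fields), and all names are hypothetical. The 16-character second field is kept as uninterpreted text.

```python
from typing import NamedTuple

# Field widths in record order (assumption: contiguous text fields).
WIDTHS = (12, 16, 4, 2, 40)


class ValidationEntry(NamedTuple):
    user_name: str
    authorized: str          # 16-character field, kept as text
    validation_level: int    # 1, 2 or 3
    security_level: int
    default_file: str


def parse_entry(line: str) -> ValidationEntry:
    """Split one fixed-width validation record into its five fields."""
    line = line.ljust(sum(WIDTHS))
    pieces, start = [], 0
    for width in WIDTHS:
        pieces.append(line[start:start + width])
        start += width
    name, authorized, vlevel, slevel, default_file = pieces
    return ValidationEntry(name.rstrip(), authorized.strip(),
                           int(vlevel), int(slevel), default_file.strip())


def may_access(entry: ValidationEntry, cruise_security_code: int) -> bool:
    """A cruise leg is accessible if its code does not exceed the user's level."""
    return cruise_security_code <= entry.security_level


def may_modify_data(entry: ValidationEntry) -> bool:
    """Levels 1 and 2 may add, delete and modify data; level 3 is read only."""
    return entry.validation_level in (1, 2)


def may_authorize_users(entry: ValidationEntry) -> bool:
    """Only level 1 may authorize new users."""
    return entry.validation_level == 1
```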
Revision Date: 24 August 1981


System  Manager's  Guide 


System  Start-up  Procedures 

Run program INITDBLD, creating the dictionary, history, and validation
files as necessary. Be sure to follow the file access instructions in
Appendix 4.


Sample run of INITDBLD:

$ RUN INITDBLD.EXE

Y           (Yes, create a new dictionary)
0           (Modulo value, 0 is default value)
            (Network name for data values)
BIGA:       (Disk name for data values)
[DBDATA]    (Directory name for data values)
::          (Network name for bounds table)
BIGA:       (Disk name for bounds table)
[DBTBLE]    (Directory name for bounds table)
::          (Network name for ship table)
BIGA:       (Disk name for ship table)
[DBTBLE]    (Directory name for ship table)
Y           (Yes, create a new history/accounting file)
Y           (Yes, create a new validation file)
Y           (Yes, create a new ship table)
0           (Modulo value for the ship table hashing function. 0 is
            default value)

Run program UPDBDIC. This program updates the dictionary structure with
the data attributes which will be stored in the library. The first entry must
be the term POSITION(S) as follows:


POSITION(S) 

N 

1 

1 

14 

12 

INTEGER 

INTEGER 


(new  term) 

(not  a  synonym) 
(group  #) 

(primary  file  type) 
(starting  byte  #) 
(length) 


the  latitude,  longitude  and  height  (z  coordinate) 
above  or  below  sea  level  identifying  the 
geographical  location  on,  over  or  below  the  earth’s 
surface 


degrees  for  latitude  and  longitude,  meters  for  height 

0  (multiplication  factor) 

0  (offset) 

0  (lowest  value) 

0  (highest  value) 

End  the  input  with  a  Control  Z. 


You should note and record what group numbers are assigned to each
secondary data attribute so that you can generate the correct read/write
routine for each record type (identified by group number). Since the software
does not automatically control how data are stored, you must decide how to
store the secondary data attributes. Then, store the data attribute names in
an order that will cause the group number to be assigned correctly. Whenever
you enter a 0 (zero) for the group number, the system automatically assigns
the next available group number to the attribute being entered.
Periodic Maintenance


The history file, DBHISTORY.ACC, as defined during the system start-up,
should be listed and removed from disk onto magnetic tape whenever it becomes
large or whenever it is time to do the charge-back operation. You must create
a new history/accounting file using the INITDBLD program and change the
protection for this file as outlined in Appendix 4.

Review  the  history  files  to  see  what  secondary  data  might  be  removed  from 
the  system  without  affecting  too  many  users.  This  procedure  will  minimize  the 
cost  of  maintaining  an  on-line  data  library. 

Review the status of the bounds tables after each run of program DBINSERT
to ensure that sufficient space is available for more new data. Also check
the ship table for available space. Both primary and secondary files will
grow as needed.


Authorizing  New  System  Users 

New users are authorized for data insert and/or retrieval by using
program DBVALID. The program requests the following input:


User name                 enter the VAX user name as assigned by the VAX
                          operations staff.

Validation level #        see Appendix 5 for a description of the
                          validation levels.

Security level #          see Appendix 5 for a description of the
                          security levels.

User default file name    this file name, complete with disk, directory
                          and file name, is executed via the @ command by
                          DBQUERY whenever this user begins execution of
                          the DBQUERY program. The file must exist even
                          if it contains no commands.


When  the  user  validation  file  is  first  created,  by  running  program 
INITDBLD,  the  username  of  the  process  executing  this  program  is  granted  system 
manager  status.  This  user  is  authorized  to  add  new  users  to  the  system. 
Additional  people  can  be  granted  system  manager  status  by  assigning  the 
appropriate authorization level number. The file MANAGER.DBQ is set up to be
the  command  file  executed  when  DBQUERY  is  first  executed  for  this  user. 
Revision Date: 13 January 1982.

Program Version Number and Date

Program Name    Version Number    Date

DBINSERT        1.10              8 January 198?
DBQUERY         1.10              11 July 198?
DBVALID         1.00              8 June 1980
SHPTBLINS       1.10              11 January 198?
UPDBDIC         1.00              October 1979

Appendix  8 


Revision  Date:  18  September  1981 


Steps  to  Add  New  MGD  77  Data 


1. Ensure the cruise leg identifier has been added to the ship table via
the SHPTBLINS program.

2. Run program CHECKMGD to check for time and navigation errors in MGD
input format.

3. Create a batch job for DBINSERT under the GROMAN account. Unit 1 is
assigned to the input data in MGD 77 format as follows:

$ ASSIGN A2093L01.MGD FOR001

$ RUN DBINSERT

A2093L01

4. Submit the batch job to run at night, during class 3. One leg takes
about 20 minutes of CPU time.

5. After successful DBINSERT, do:

a. Backup of data base files (DBSAVE.COM)

b. Backup of DBINSERT.BAT and MGD data files (MGDSAVE.COM)

c. Delete MGD data files from disk.

6. If DBINSERT job was not successful, restore data base files from most
recent saveset using DBRESTORE.COM command procedure.